Web Scraper - Python

Date Posted: Jun 11, 2022

Full Time (40hrs/week)

More than 1 year

Pune, Maharashtra, India

Job Description: Web Scraper - Python

Hours: Monday – Friday (some hours outside of this as required)

Work hours: EST shift (6:00 pm to 3:00 am). It will be a fixed shift.

Location: Pune, IT Cerebrum Park, Kalyani Nagar

Length: Permanent position, with three-month probation period

Salary:

About the Company

Valasys Media is a global integrated marketing and sales process outsourcing company that specializes in helping companies build a sales pipeline of qualified opportunities and shorten the sales cycle for their product and service portfolios. As part of our capability, we also help create market visibility, build awareness, and establish business relationships in new markets.

Job brief

We are looking for a Web Scraper (Python) to help us expand and optimize our data and data flow. The ideal candidate will be responsible for extracting and ingesting data from websites/URLs using web crawling and scraping tools. In this role, you will own the creation of the tools, services, and workflows that improve data crawling, scraping, and database management.

To do this job successfully, you need exceptional programming and web skills. Knowledge of data science and software engineering will be an added advantage. Your ultimate goal will be to maintain the data flow by scraping, crawling, and cleaning data as required.

Key skills: Web Scraping, Web Crawling, Web and Windows Automation, Python/R, Selenium, NLP, Data Extraction, SQL/NoSQL, OpenCV, AutoIt, PyAutoGUI

Requirements and skills

Proven experience as a Web Scraper/Crawler or in a similar role
Strong understanding and working knowledge of web crawlers, web scrapers, and other automation tools for browsing web content
Knowledge of web scraping techniques and tools
Strong knowledge of one or more of the open-source and proprietary scraping frameworks available
Hands-on experience with SQL/NoSQL databases (MySQL, Postgres, Cassandra, MongoDB)
Good knowledge of and coding experience in one or more programming languages such as Python, Java, or JavaScript
Experience creating Scrapy spiders for websites with CAPTCHAs, IP bans, geolocation bans, Cloudflare / Distil / Imperva firewalls, sites that require login to access data, and dynamic websites that load through JS / REST API / GraphQL, etc.
Knowledge of object-oriented programming
Experience with applications designed to display archived web content
Experience with AWS cloud services (EC2)
Python tech stack (Python libraries: Scrapy, requests, urllib, Beautiful Soup, Splash, Selenium, pandas)
2-4 years' experience and a Bachelor's degree in Computer Science, Engineering, Technology, or a related field required

Responsibilities

Program and apply your knowledge set to fetch data from multiple online sources and cleanse it
Develop application frameworks for automating and maintaining a constant flow of data from multiple sources
Design and build web crawlers to scrape data and URLs using Python modules (Scrapy, Selenium, requests, Beautiful Soup, Splash, etc.)
Create crawlers for all types of websites, irrespective of technical roadblocks
Manage the crawlers to overcome technical challenges such as IP bans, geolocation bans, CAPTCHAs, and bot-blocking services
Design Scrapy pipelines to connect the crawler output to a MySQL database
Integrate the crawled and scraped data into our databases
Build and maintain high-quality, reusable code
Automate manual processes, optimize data delivery, and redesign infrastructure for greater scalability
