80legs is used by hundreds of companies to crawl and extract data from the Web. Here are just a few of our awesome clients:
80legs gives Adify the scalability needed to analyze millions of websites on the internet through their flexible and easy to use API, as well as their responsive and thoughtful support staff.
Jim Larrison, Chief Revenue Officer, Adify
80legs has helped us enhance the performance of our product by allowing us to perform massive web crawling and data extraction within a short period of time.
Hugues de Mazancourt, Lingway
Good content can be hard to find. As individuals, we spend several hours each week browsing through web pages and trying to find content that's interesting. Advertising networks know that, and they want to make sure the ads they deliver are displayed next to the best and most interesting content. To do this, they crawl the web, trying to identify websites with interesting content. The more websites they know about, the more potential ad channels they have.
Adify is one of the top advertising networks in the country, and they use 80legs to help power their web crawling and analysis of interesting web content as a component of their market mapping™ methodology. To do this, Adify has created their own custom 80legs code to process the content of a web page and determine whether or not the web page or domain provides interesting and relevant content. Over time, they have built several applications on the 80legs platform to tell them whether or not a domain fits potential advertisers' needs. With the scale and customization provided by 80legs, Adify can do this quickly, easily and cost-effectively.
Adify has crawled over 50 million targeted websites with 80legs. When coupled with Adify's proprietary insights and other industry leading sources of analytics, these crawls help expand Adify's extensive database of websites and create a comprehensive map of potential advertising channels on the web. By mapping the Internet in this manner and creating market maps™, Adify is able to provide their customers strategic guidance on content monetization.
As millions of web pages are created every day, IP protection is an ever-growing concern for content creators. While most folks associate IP protection with things like music and movies, these aren't the only types of content that need to be protected. Monotype Imaging uses IP protection services to track the usage of font types across the web.
To support its IP protection services, Monotype Imaging uses 80legs to run incredibly large scans of the web. These scans crawl across tens of thousands of popular domains and identify the location of fonts on the web pages of these domains. 80legs uses a proprietary algorithm, provided by Monotype and converted to an 80app, to check these files and extract metadata from them. Using this information, Monotype can essentially run a gigantic data collection survey of how and where particular fonts are used on the web.
The web crawl run by 80legs processes 80 million URLs in about 2 days and updates its findings on a monthly basis, though it could update more frequently if necessary. This kind of powerful web crawling enables Monotype to stay up to date and gives them unsurpassed competitive and customer intelligence.
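To put those figures in perspective, the short Python sketch below works out the average crawl rate they imply. The function name `urls_per_second` is made up for this illustration, and the result is only an average over the whole run, not a measured or guaranteed rate.

```python
def urls_per_second(total_urls, days):
    """Average crawl rate implied by a URL count and a duration in days."""
    seconds = days * 24 * 60 * 60
    return total_urls / seconds


# Figures quoted above: 80 million URLs in about 2 days.
rate = urls_per_second(80_000_000, 2)
print(f"{rate:.0f} URLs per second on average")  # prints "463 URLs per second on average"
```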
Extractiv uses powerful semantic analysis software on top of 80legs web crawls to turn any web content into structured semantic data. Extractiv can automatically determine the "subject" of any web page.
Extractiv packs robust semantic and NLP algorithms into custom 80apps. When Extractiv runs a job, it builds a unique 80app based on the topic it’s supposed to track and feeds it into an 80legs web crawl. That web crawl can run over blogs, news sites and any other discussion on the Web and convert any unstructured text data into structured semantic data.
With 80legs’ scalable web-crawling, Extractiv can process large volumes of Web content and generate a complete understanding of the topics discussed on any web page.
Sentiment analysis is in big demand these days. Lingway uses natural language processing (NLP) to understand how people feel about various brands. Lingway specializes in processing text data, but they rely on the specialty of 80legs to gather that data from the Web.
Here's how Lingway's workflow handles data extraction and collection:
- Search engines are used to generate a list of URLs related to given keywords about a brand.
- The URL list is uploaded to 80legs as a seed list, and a web crawl is started from this seed list.
- During the web crawl, a custom data extractor (aka "80app") is used to process and clean up the text content of a web page.
- The results generated by the 80legs web crawl are then fed into Lingway's NLP tools, which determine sentiment.
The 80legs API and 80app framework, along with the raw bandwidth and web crawling speed provided by 80legs, let Lingway crawl the web in a very short time for any given topic. 80legs helps Lingway with massive distributed data cleanup and enhances the performance of its own product.
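For readers who want a feel for the text cleanup and sentiment steps in the workflow above, here is a minimal Python sketch. It is not Lingway's code and does not use the 80legs API or the 80app framework. Every name in it (`TextExtractor`, `clean_page_text`, `toy_sentiment`, `run_pipeline`) is hypothetical, the crawl output is assumed to be a plain mapping of URL to raw HTML, and the word-count scorer is only a stand-in for real NLP tooling.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def clean_page_text(html):
    """Hypothetical cleanup step: strip markup and collapse whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def toy_sentiment(text):
    """Placeholder scorer: positive word count minus negative word count."""
    positive = {"love", "great", "good"}
    negative = {"hate", "bad", "poor"}
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in positive for w in words) - sum(w in negative for w in words)


def run_pipeline(crawl_results):
    """Assumes crawl_results maps each URL to the raw HTML fetched for it."""
    return {
        url: toy_sentiment(clean_page_text(html))
        for url, html in crawl_results.items()
    }
```

In this sketch the cleanup runs locally after the crawl. In the workflow described above, that step runs inside the crawl itself, which is what spreads the cleanup work across the crawl.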
Ready to join these companies in going Web-Scale?