Data Extraction Approach for Aggregator Platforms
Abstract
The purpose of this research proposal is to identify an optimal data extraction solution for aggregator platforms. This review draws on multiple research papers and weighs the pros and cons of the approaches they describe. First, let us discuss the aggregator platform itself. What do we mean by the aggregator model? An aggregator business model is a network model in which a firm collects information about providers of a particular good or service and makes those providers its partners (Ghosh, 2022, p. 1). Companies that use the aggregator business model include Amazon and Flipkart (Ecommerce, n.d.).
This model has several advantages. It is cost-effective and quick to set up, and there is no inventory cost involved; the company that owns the platform can instead focus on marketing and user experience (Miles, 2022, p. 2). However, the model also presents challenges. The first is dependency on other platforms such as Google and Facebook for traffic: if they launch a competing platform, they can change their algorithms, promote their own ads, and rank their own websites on top. The second is accumulating data from many different sites and companies on a single platform. Beyond APIs, scraping data from various websites and presenting it on the portal is also a challenging task.
So, what are the major solution approaches discussed in this research proposal? The first is using protocols such as the Simple Object Access Protocol (SOAP) and Representational State Transfer (REST) to build Application Programming Interfaces (APIs) for exchanging data with partner sites. The second is applying Computer Vision (CV) and Natural Language Processing (NLP) to extract essential information from any source.
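To make the API-based approach concrete, the following is a minimal Python sketch of pulling listings from a partner's REST endpoint for aggregation. The endpoint URL, query parameters, and response fields are hypothetical placeholders for illustration, not a real partner service.

```python
import requests

# Hypothetical partner REST endpoint -- illustrative only, not a real service.
PARTNER_API = "https://api.example-partner.com/v1/products"

def fetch_partner_products(category: str, api_key: str) -> list[dict]:
    """Pull product listings from a partner's REST API for aggregation."""
    response = requests.get(
        PARTNER_API,
        params={"category": category},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=10,
    )
    response.raise_for_status()  # surface HTTP errors instead of failing silently
    return response.json().get("products", [])

if __name__ == "__main__":
    # Assumed response shape: {"products": [{"name": ..., "price": ...}, ...]}
    for item in fetch_partner_products("electronics", api_key="demo-key"):
        print(item.get("name"), item.get("price"))
```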
In today's fast-paced digital world, staying up-to-date with the latest website changes is crucial for various reasons, including staying competitive, ensuring the accuracy of information, and maintaining compliance with regulations. However, the dynamic nature of online content makes it challenging to keep track of every minute change that occurs. Fortunately, advancements in technology, particularly in the field of computer vision, offer solutions to address this issue.
This proposal also concerns the development of a system capable of monitoring changes on websites in near real time. The system can analyse specific web pages and detect any alterations that occur within them. This approach involves capturing screenshots of the targeted web pages at regular intervals and comparing them to identify differences.
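As an illustration of this capture-and-compare loop, the sketch below uses Selenium to take screenshots of a page at fixed intervals and Pillow to check whether two captures differ. The target URL and polling interval are assumptions made for demonstration purposes.

```python
import shutil
import time
from PIL import Image, ImageChops
from selenium import webdriver

# Hypothetical target page and polling interval -- adjust to the pages being monitored.
TARGET_URL = "https://example.com/listings"
INTERVAL_SECONDS = 3600

def capture(driver: webdriver.Chrome, path: str) -> None:
    """Load the page and save a full-window screenshot."""
    driver.get(TARGET_URL)
    driver.save_screenshot(path)

def has_changed(previous_path: str, current_path: str) -> bool:
    """Return True if the two screenshots differ anywhere."""
    prev = Image.open(previous_path).convert("RGB")
    curr = Image.open(current_path).convert("RGB")
    diff = ImageChops.difference(prev, curr)
    return diff.getbbox() is not None  # None means the images are identical

if __name__ == "__main__":
    driver = webdriver.Chrome()
    driver.set_window_size(1280, 1024)  # fixed size keeps screenshots comparable
    capture(driver, "baseline.png")
    while True:
        time.sleep(INTERVAL_SECONDS)
        capture(driver, "latest.png")
        if has_changed("baseline.png", "latest.png"):
            print("Change detected on", TARGET_URL)
            shutil.copy("latest.png", "baseline.png")  # new baseline for next round
```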
One of the key components of this system is its ability to focus on specific links or sections of websites based on predefined criteria. This targeted approach allows for efficient monitoring of relevant content without the need to scan entire websites indiscriminately. Using computer vision algorithms, the system can accurately identify and isolate the desired elements within web pages for analysis.
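A minimal sketch of such targeted capture is shown below, assuming a hypothetical product page and CSS selector; in practice the selector would come from the predefined monitoring criteria for that site.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Hypothetical criteria: only the price block on a product page is of interest.
TARGET_URL = "https://example.com/product/123"
PRICE_SELECTOR = "div.price-box"  # assumed CSS selector, site-specific in practice

def capture_section(path: str) -> None:
    """Screenshot only the page section matching the predefined selector."""
    driver = webdriver.Chrome()
    try:
        driver.set_window_size(1280, 1024)
        driver.get(TARGET_URL)
        element = driver.find_element(By.CSS_SELECTOR, PRICE_SELECTOR)
        element.screenshot(path)  # crops the capture to the element's bounding box
    finally:
        driver.quit()

capture_section("price_box.png")
```

Capturing only the targeted element keeps later comparisons cheap and avoids flagging irrelevant changes elsewhere on the page.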
Once the screenshots are captured, the system employs various algorithms to compare them and detect any changes that have occurred. This process may involve techniques such as image differencing, pattern recognition, and optical character recognition (OCR) to identify text, images, or layout alterations. By comparing screenshots taken at different points in time, the system can pinpoint the exact changes that have occurred, no matter how subtle they may be.
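The following sketch illustrates one possible differencing pipeline, using OpenCV for pixel-level comparison and Tesseract (via pytesseract) to OCR the changed regions. The threshold and minimum-area values are illustrative assumptions, and the two screenshots are assumed to have the same dimensions, as captures of a fixed-size window would.

```python
import cv2
import pytesseract  # requires the Tesseract OCR engine to be installed

def diff_regions(before_path: str, after_path: str) -> list[dict]:
    """Locate changed regions between two screenshots and OCR the new text."""
    before = cv2.imread(before_path)
    after = cv2.imread(after_path)

    # Pixel-wise difference, then threshold to a binary change mask.
    diff = cv2.absdiff(before, after)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    changes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w * h < 100:  # ignore tiny, noisy differences
            continue
        region = after[y:y + h, x:x + w]
        text = pytesseract.image_to_string(region).strip()
        changes.append({"box": (x, y, w, h), "text": text})
    return changes

for change in diff_regions("baseline.png", "latest.png"):
    print(change["box"], "->", change["text"])
```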
In addition to detecting changes, the system also provides mechanisms for notification and reporting. When significant alterations are detected, the system can generate alerts or notifications to inform relevant stakeholders. These notifications may include details about the nature of the changes, the affected web pages, and the timeframe in which they occurred. Furthermore, the system may generate reports summarising the detected changes for further analysis or documentation purposes.
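One possible shape for this notification and reporting layer is sketched below, assuming hypothetical SMTP settings and addresses; the report structure is illustrative rather than prescriptive.

```python
import json
import smtplib
from datetime import datetime, timezone
from email.message import EmailMessage

# Hypothetical SMTP host and addresses -- placeholders for illustration only.
SMTP_HOST = "smtp.example.com"
ALERT_SENDER = "monitor@example.com"
ALERT_RECIPIENT = "stakeholder@example.com"

def build_report(url: str, changes: list[dict]) -> dict:
    """Summarise detected changes for documentation or later analysis."""
    return {
        "url": url,
        "detected_at": datetime.now(timezone.utc).isoformat(),
        "change_count": len(changes),
        "changes": changes,
    }

def send_alert(report: dict) -> None:
    """Email a short notification describing what changed and where."""
    message = EmailMessage()
    message["Subject"] = f"{report['change_count']} change(s) detected on {report['url']}"
    message["From"] = ALERT_SENDER
    message["To"] = ALERT_RECIPIENT
    message.set_content(json.dumps(report, indent=2))
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(message)
```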
The application of computer vision techniques in website monitoring offers several benefits. Firstly, it allows the automated and continuous monitoring of web content, reducing the need for manual intervention and oversight. This not only saves time and resources but also ensures timely detection of changes. Additionally, by focusing on specific links or sections of websites, the system can tailor its monitoring efforts to the needs of individual users or organisations, enhancing their efficiency and relevance.
Moreover, the ability to accurately detect and track changes on websites can have significant implications across various industries and domains. For example, in the e-commerce sector, monitoring product pages for price changes or availability updates can help retailers stay competitive and optimise pricing strategies. In the financial services industry, tracking changes on regulatory websites can ensure compliance with evolving regulations and mitigate risks.
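As a small illustration of the e-commerce case, the sketch below extracts a price-like token from OCR output of a monitored page region and flags a change between two runs; the regular expression and sample strings are assumptions for demonstration.

```python
import re

# Matches tokens such as "$24.99", "€19,99", "£5" -- a simplifying assumption.
PRICE_PATTERN = re.compile(r"[$€£]\s?\d+(?:[.,]\d{2})?")

def extract_price(ocr_text: str) -> str | None:
    """Pull the first price-like token from OCR output of a product page region."""
    match = PRICE_PATTERN.search(ocr_text)
    return match.group(0) if match else None

# Illustrative comparison between two monitoring runs.
old_price = extract_price("Wireless Mouse  $24.99  In stock")
new_price = extract_price("Wireless Mouse  $19.99  In stock")
if old_price != new_price:
    print(f"Price changed: {old_price} -> {new_price}")
```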
Overall, the development of a system capable of monitoring changes on websites using computer vision techniques represents a valuable tool for businesses, researchers, and other stakeholders. By leveraging the power of automation and artificial intelligence, this system enables proactive monitoring and timely response to changes in online content, ultimately contributing to enhanced efficiency, accuracy, and competitiveness in the digital landscape.