Massive heterogeneous big data residing at different sites, in various types and formats, needs to be integrated into a single unified view before data mining can start. Furthermore, in most applications and research, a single big data source is not enough to complete the analysis and achieve its goals. Unfortunately, there is no general or standardized integration process; the nature of an integration process depends on the data type, domain, and integration purpose. Based on these parameters, we proposed, implemented, and tested a big data integration framework that integrates big data in the biology domain, based on the domain ontology and using distributed processing. The distributed integration produced the same result as the local integration. The results are equivalent in terms of the ontology size before the integration; the number of added, skipped, and overlapped items; the ontology size after the integration; and the number of edges, vertices, and roots. The results also do not violate any logical consistency rules, passing all the logical consistency tests run with the Jena Ontology API and the HermiT and Pellet reasoners. The integration result is a new big data source that combines big data from several critical sources in the biology domain and transforms it into one unified format to help researchers and specialists use it for further research and analysis.

Deep Web crawling refers to the problem of traversing the collection of pages in a deep Web site, which are dynamically generated in response to a particular query that is submitted using a search form. To achieve this, crawlers need to be endowed with some features that go beyond merely following links, such as the ability to automatically discover search forms that are entry points to the deep Web, fill in such forms, and follow certain paths to reach the deep Web pages with relevant information.
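
The first two deep Web crawling abilities mentioned above, discovering search forms and filling them in, can be illustrated with a small sketch. This is only a minimal illustration using Python's standard library, not the method of any particular crawler: the class `SearchFormFinder`, the helpers `find_search_forms` and `build_query_url`, and the heuristic "a form with at least one named free-text input is a search form" are all assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class SearchFormFinder(HTMLParser):
    """Collect forms that contain at least one named free-text input.

    Hypothetical heuristic: such forms are treated as candidate
    entry points to the deep Web.
    """

    TEXT_TYPES = {"text", "search"}

    def __init__(self):
        super().__init__()
        self.forms = []       # finished candidate forms
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "text_fields": [],
            }
        elif tag == "input" and self._current is not None:
            # An input without a type attribute defaults to "text".
            input_type = (attrs.get("type") or "text").lower()
            name = attrs.get("name")
            if name and input_type in self.TEXT_TYPES:
                self._current["text_fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            # Keep only forms that have a free-text field to fill in.
            if self._current["text_fields"]:
                self.forms.append(self._current)
            self._current = None


def find_search_forms(html):
    """Return the candidate search forms found in an HTML string."""
    finder = SearchFormFinder()
    finder.feed(html)
    finder.close()
    return finder.forms


def build_query_url(page_url, form, keyword):
    """Fill the first text field with a keyword (GET forms only)."""
    if form["method"] != "get":
        raise ValueError("only GET forms are handled in this sketch")
    target = urljoin(page_url, form["action"])
    query = urlencode({form["text_fields"][0]: keyword})
    separator = "&" if "?" in target else "?"
    return f"{target}{separator}{query}"
```

A crawler built along these lines would fetch a page, pass its HTML to `find_search_forms`, and request the URL returned by `build_query_url` for each keyword of interest. Real deep Web sites often need more than this: POST forms, select boxes, sessions, and JavaScript-generated forms are all outside the scope of the sketch.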