DSpace Repository

Distributed Wikigrep: advanced searches in Wikipedia


dc.contributor.author Varas Palomeque, Irene Carolina
dc.contributor.author Paladines Herrera, Gabriel Antonio
dc.contributor.author Abad, Cristina
dc.date.accessioned 2009-10-15
dc.date.available 2009-10-15
dc.date.issued 2009-10-15
dc.identifier.uri http://www.dspace.espol.edu.ec/handle/123456789/7701
dc.description.abstract In this project we created a regular-expression search engine that uses the Wikipedia article database. The system lets the user enter a regular expression and makes an asynchronous request to initialize an EC2 cluster; it then searches for the pattern across all of Wikipedia and returns the result, displaying a list of all occurrences of the pattern, each with a link to the corresponding Wikipedia article. We used Amazon Web Services, Java libraries for manipulating Wikipedia articles, the Hadoop framework, and a dataset of Wikipedia articles. We tested several regular expressions that could not be searched for using either traditional search engines or the Wikipedia search engine. Our tests show that an advanced search engine can be cheap to implement while providing high scalability through the use of cloud computing and data-intensive computing techniques. en
dc.language.iso spa en
dc.rights openAccess
dc.subject HADOOP en
dc.subject CLOUD COMPUTING en
dc.subject MAPREDUCE en
dc.subject ELASTIC MAPREDUCE en
dc.subject SIMPLE STORAGE SERVICE S3 en
dc.subject WIKIPEDIA en
dc.subject DATASET en
dc.subject EC2 CLUSTER en
dc.title Wikigrep distribuido: búsquedas avanzadas en la wikipedia en
dc.type Article en
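
The abstract describes a grep-style search in which a regular expression is matched against every Wikipedia article and each occurrence is returned with a link to its article. The following is a minimal single-process sketch of that map/reduce pattern, not the authors' implementation: the record does not include their code, and the real system ran on Hadoop over an EC2 cluster. The sample `ARTICLES` data and the function names `map_article`, `reduce_matches`, `wikigrep`, and `article_url` are hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical in-memory stand-in for the Wikipedia dataset: title -> text.
ARTICLES = {
    "Hadoop": "Apache Hadoop supports MapReduce jobs.",
    "Grep": "grep searches text for lines matching a pattern.",
    "Ecuador": "Guayaquil is the largest city in Ecuador.",
}


def map_article(title, text, pattern):
    """Map step: emit one (title, matched_text) pair per occurrence."""
    for match in pattern.finditer(text):
        yield title, match.group(0)


def reduce_matches(pairs):
    """Reduce step: group the emitted matches by article title."""
    grouped = defaultdict(list)
    for title, matched in pairs:
        grouped[title].append(matched)
    return dict(grouped)


def wikigrep(regex, articles):
    """Run the map step over every article, then reduce the results."""
    pattern = re.compile(regex)
    pairs = []
    for title, text in articles.items():
        pairs.extend(map_article(title, text, pattern))
    return reduce_matches(pairs)


def article_url(title):
    """Build a link to the article (assumes the English Wikipedia URL scheme)."""
    return "https://en.wikipedia.org/wiki/" + title.replace(" ", "_")


if __name__ == "__main__":
    # Print each matching article's link together with its matches.
    for title, matches in wikigrep(r"[Mm]atch\w*", ARTICLES).items():
        print(article_url(title), matches)
```

In a distributed setting, each mapper would receive only a slice of the article dump, which is what allows the search to scale with the number of cluster nodes.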

