On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets

Ocsa, Alexander; Huillca, Jose Luis; López Del Alamo, Cristian

Date

2018-07-04

Author

Ocsa, Alexander

Huillca, Jose Luis

López Del Alamo, Cristian

Abstract

Approximate similarity search algorithms based on hashing were proposed to query high-dimensional datasets because of their fast retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Networks (CNNs) with hashing techniques to improve search accuracy. However, several challenges must be solved to obtain a practical and efficient solution for indexing CNN features, such as the need for a heavy training process to achieve accurate query results and the critical dependency on data parameters. Aiming to overcome these issues, we propose a new method for scalable similarity search, Deep frActal based Hashing (DAsH), which computes the best data parameter values for optimal subspace projection by exploring the correlations among CNN feature attributes using fractal theory. Moreover, inspired by recent advances in CNNs, we use not only the activations of lower layers, which are more general-purpose, but also prior knowledge of the semantic data in the last CNN layer to improve search accuracy. Thus, our method produces a better representation of the data space at a lower computational cost and with better accuracy. This significant gain in speed and accuracy allows us to evaluate the framework on a large, realistic, and challenging set of datasets.
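
As a rough illustration of the general idea of hashing-based approximate similarity search mentioned in the abstract, the sketch below uses plain random-hyperplane hashing with Hamming ranking. It is not the DAsH method: it involves no CNN features and no fractal-based parameter selection. The function names (`make_hyperplanes`, `hash_code`, `build_index`, `query`) and the parameters `n_bits` and `seed` are hypothetical choices made for this example only.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Map a vector to a binary code: one bit per hyperplane (side of the plane)."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def build_index(vectors, hyperplanes):
    """Store only the compact binary codes, not the original vectors."""
    return [hash_code(v, hyperplanes) for v in vectors]


def query(q, codes, hyperplanes, k=1):
    """Return the indices of the k stored codes closest to q in Hamming distance."""
    qc = hash_code(q, hyperplanes)
    ranked = sorted(range(len(codes)), key=lambda i: hamming(qc, codes[i]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy 4-dimensional "feature vectors" standing in for real descriptors.
    data = [[1.0, 0.0, 0.0, 0.0], [0.5, -1.0, 2.0, 0.25], [0.0, 0.0, -1.0, 3.0]]
    planes = make_hyperplanes(dim=4, n_bits=8, seed=1)
    codes = build_index(data, planes)
    print(query([0.4, -0.9, 2.1, 0.2], codes, planes, k=2))
```

In this sketch the number of bits and the random projections are fixed by hand. The abstract describes such data-dependent parameters as a critical dependency that the proposed method aims to set automatically.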

URI

http://repositorio.ulasalle.edu.pe/handle/20.500.12953/30

Collections

Computer Science