Latent Dirichlet Allocation for Network Intrusion on Apache Spot
Gustavo A. Lujan Moreno, Data Scientist, Intel

Abstract: Detecting anomalies in network traffic can help network analysts prevent or mitigate potential threats. Latent Dirichlet Allocation (LDA) is a generative probabilistic model for discrete data such as text corpora. It is a three-level hierarchical Bayesian model in which each item of a collection is modeled as a finite mixture over an underlying set of topics. LDA can be applied to network traffic by constructing words from network information and treating IP addresses as the documents. The talk will introduce the mathematical theory of LDA, word creation from netflows, and implementation and performance evaluation using Apache Spot.
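
As a rough illustration of the word-creation idea, the sketch below turns a few made-up flow records into one bag of words per source IP, which is the kind of input an LDA implementation expects. The record layout, the bucket boundaries, and the helper names (`flow_to_word`, `build_documents`) are assumptions made for this example; they are not the word scheme used by Apache Spot.

```python
from collections import Counter, defaultdict

# Hypothetical flow records: (src_ip, dst_ip, dst_port, protocol, n_bytes, hour)
FLOWS = [
    ("10.0.0.1", "10.0.0.9", 80, "TCP", 1200, 9),
    ("10.0.0.1", "10.0.0.9", 80, "TCP", 1500, 10),
    ("10.0.0.1", "10.0.0.7", 53, "UDP", 80, 10),
    ("10.0.0.2", "10.0.0.9", 51234, "TCP", 5_000_000, 3),
]


def bucket_port(port):
    """Keep well-known ports as-is; collapse all others into one bucket."""
    return str(port) if port < 1024 else "high"


def bucket_bytes(n_bytes):
    """Discretize the byte count (thresholds are arbitrary, for illustration)."""
    if n_bytes < 1_000:
        return "small"
    if n_bytes < 1_000_000:
        return "medium"
    return "large"


def bucket_hour(hour):
    """Discretize the hour of day into working hours vs. off hours."""
    return "work" if 8 <= hour < 18 else "off"


def flow_to_word(dst_port, protocol, n_bytes, hour):
    """Combine the discretized flow features into a single 'word'."""
    return "_".join(
        [protocol, bucket_port(dst_port), bucket_bytes(n_bytes), bucket_hour(hour)]
    )


def build_documents(flows):
    """Group words by source IP: each IP is a document, stored as word counts."""
    docs = defaultdict(Counter)
    for src_ip, _dst_ip, dst_port, protocol, n_bytes, hour in flows:
        docs[src_ip][flow_to_word(dst_port, protocol, n_bytes, hour)] += 1
    return docs


documents = build_documents(FLOWS)
for ip, words in documents.items():
    print(ip, dict(words))
```

The resulting per-IP word counts could then be passed to an LDA implementation of choice; flows whose words have low probability under the fitted topic mixture of their IP would be the candidates for review.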

Biography:
Gustavo A. Lujan was born in Los Mochis, Sinaloa, Mexico, in 1980. He received his B.S. degree in industrial engineering from the Instituto Tecnologico de Sonora, Ciudad Obregon, Sonora, Mexico, in 2001, an M.E. degree in quality and productivity from the ITESM, Monterrey, Nuevo Leon, Mexico, in 2008, and an M.S. in industrial engineering from Arizona State University, Tempe, Arizona, in 2013. He was certified as a Six Sigma Black Belt in 2008 by the same university. He is currently a PhD candidate in the industrial engineering program at Arizona State University. From 2001 to 2010, he worked for Cerveceria Cuauhtemoc Moctezuma (now Heineken Mexico), holding several positions. From 2011 to 2013, he worked in the Speech and Hearing Department at Arizona State University as Database Manager. After returning from the USA, he joined Oracle as a Problem Manager. He currently works at Intel as a Data Scientist in Zapopan, Jalisco, Mexico. His research interests include statistical and machine learning, data mining, statistical process control, design of experiments, and Six Sigma.

Organized by:
División Académica de Ingeniería
Location:
Sala de Maestros, upper floor, Río Hondo
Email:
Extension or phone:
5628 4000 ext. 3614