Instructor(s): Roland Molontay
Weeks: 7-14
Contact hours: 2x2 hours
Credit: 2 credits

Short Description of the Course:
"What data scientists do is make discoveries while swimming in data", as described by the Harvard Business Review. In the second part of the course, we learn advanced techniques including kernel methods, recommender systems, and network centrality, and we are introduced to Big Data tools such as Hadoop. During the course, we will have guest lectures by data scientists from companies in the Budapest area. Students will have the option to define their own data mining projects and work in teams during the semester.

Aim of the Course:
The aim of the course is to discuss advanced techniques of data mining with useful knowledge of related disciplines supporting real-world, especially bioinformatics data mining projects. By the end of the course, students will be able to analyze biological (genomic, microarray, pathway, protein, chemical) data sets using complex data mining methods.

Prerequisites:
The course requires basic knowledge of data mining (see also the course Data Mining: Models and Algorithms). A background in probability theory, linear algebra, and programming is important.

Detailed Program and Class Schedule:

  • Advanced classification methods: Bagging, boosting, AdaBoost.
  • More models and algorithms for classification: neural networks, linear separation methods, support vector machine (SVM).
  • Random forest.
  • Recommender systems. Collaborative filtering. Implicit and explicit recommendation.
  • Search engines, web information retrieval, PageRank and network mining.
  • Selected topics connected to student projects (e.g. mining biological, scientific, and social media data).
  • Final test.

Method of Instruction:
Handouts, presentations, IPython Notebooks, relevant research papers, a web page, a course mailing list, and a Wiki. Regular weekly office hours for consultations.

Textbooks:
Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Addison-Wesley, 2006.

Jure Leskovec, Anand Rajaraman, Jeff Ullman: Mining of Massive Datasets, http://www.mmds.org/

Instructor's bio:

Roland Molontay (born 1991) obtained his PhD degree in network and data science from the Budapest University of Technology and Economics (BME). He was a visiting PhD student at Brown University in 2016. He currently holds a research position at the MTA-BME Stochastics Research Group and also teaches mathematics and data science at BME to undergraduate and graduate students. He has participated in many successful data-intensive R&D projects with renowned companies (such as NOKIA-Bell Labs) over the years. He was awarded the Gyula Farkas Memorial Prize in 2020 for his outstanding work in applied mathematics. He is the founder and leader of the Human and Social Data Science Lab at BME.

Student Reviews of This Course

"The Data Science course solidified my decision to pursue a career in the field. Professor Molontay engaged us with the material really well as we discussed topics from gradient boosting to artificial neural networks. Professor Molontay even mentored my class project group in transforming our final project into a research paper which has been accepted into the journal Applied Network Science."

Tiernon Riesenmy

The University of Kansas

"Data Science was a great introduction to how to gather and manage big datasets! Professor Molontay gave a great overview of all the algorithms one can use to extract information from these datasets and clearly explained how these algorithms manage to do so. It was a super rewarding class and inspired me to explore topics in Machine Learning further!"

Kiersten Campbell

Williams College

"Data Science is a great class. It is taught very well. Prof. Molontay was constantly checking in on our progress and how we were doing. I think that was really helpful. He obviously cared about how much the students were learning and that we were actually grasping the concepts and not just getting by. He was always in very close contact with the students which was good."

Kate Barnes

Colorado College