COT6930 – Big Data Analytics with Hadoop


Contact / Questions

TA: Aaron Richter,

There is a Piazza forum set up for this class (Click the Q&A tab):

Here, all students can ask questions and discuss the homework assignments. I will be actively monitoring this forum to answer questions for all students. This is the preferred way to ask questions about the homework assignments. DO NOT post any code, solutions, or results for the assignments. Please keep your questions general enough that they do not give away any answers. I will be moderating the site to make sure that this does not happen.

If you have questions regarding other aspects of the course, please contact Dr. Khoshgoftaar directly at: When sending an email, please include “COT 6930” in the subject so that we can give it the necessary time and attention.

Due dates & Submission

Due dates are not flexible. Please provide your reports on the due date. Make sure you organize yourself and start at once – the Hadoop environment is complex, and you will need to start early to ensure that you do not have problems running the experiments. Do not start two days before the due date.

Each student should send one .zip file of their homework, named yourname_hwk*.zip, to by 11:59 PM on the due date, with the subject of the email reading “COT6930 Homework *”. (NOTE: the * stands for the current homework number. For example, if you are handing in Homework 2, then “*” will be “2”.) The zip file should contain one PDF file with the report, along with any source files specified by the particular homework assignment. Please include screenshots and console outputs in the PDF, not as separate files.
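The naming and packaging convention above can be sketched in Python. This is only an illustrative helper, not a course-provided tool: the function name `build_submission` and its parameters are hypothetical, and it assumes the report already exists as a PDF.

```python
import zipfile
from pathlib import Path


def build_submission(name, hw_number, report_pdf, source_files=(), out_dir="."):
    """Bundle one report PDF plus any source files into <name>_hwk<N>.zip.

    Hypothetical helper: the function name and arguments are illustrative
    assumptions, not part of the course materials.
    """
    report = Path(report_pdf)
    if report.suffix.lower() != ".pdf":
        raise ValueError("the report must be a PDF file")

    archive = Path(out_dir) / f"{name}_hwk{hw_number}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # Files are stored flat (by file name only), so two source files
        # with the same name in different folders would collide.
        for path in [report, *map(Path, source_files)]:
            zf.write(path, arcname=path.name)
    return archive
```

For example, calling `build_submission("yourname", 2, "report.pdf", ["wordcount.py"])` (both file names are placeholders) would write `yourname_hwk2.zip` to the current directory.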


In addition to using various tools with Hadoop, this course involves analyzing experimental results. When analysis is required, please follow these guidelines:

  • The reports should include the detailed results of your experiments. Make sure you present your results in a synthesized, summarized form (not just a printout of the results).
  • Experimental work presented without any analysis is useless. Please analyze your results, and draw meaningful conclusions.
  • There is no exact template for the reports, but you have to organize them in a way that makes sense.
  • Do not forget to provide the methodology you used for your experiments.
  • Look at the references to get an idea of how researchers present, summarize, and analyze their results.


The grading for the project will be based on your reports, so make sure you not only work hard, but also document your work well. The following are a few important areas that will be considered when grading your report:

  • Well-performed experiments that follow a good methodology
  • Good presentation of the experiments (use tables)
  • Synthesized analysis and comparison of the results of different models
  • Conclusions

A printout of the results without any explanation, comparison or conclusion is not considered sufficient.

Homework Assignments

To complete assignments 1-3, you will need to install the course virtual machine.

Slides & References

Additional Materials

The course lectures and assignments will teach you the necessary elements to succeed in this course, but if you would like to dive deeper into specific topics, the following materials are freely available online. Online course websites such as Udacity and edX offer paid versions of courses, but you can access all course materials for free by signing up for an account.