CSC 541 Syllabus

Class Home Page

http://dbgroup.ncsu.edu/?page_id=2016

Logistics, staff, and contacts

The on-campus section of the class meets in 2213 EB-III on Mondays and Wednesdays between 2:20-3:35pm; the videos of the lectures will typically be available to Section 601 students for viewing on the day of the lecture.

Instructor: Rada Chirkova

Instructor’s email policy: The instructor’s experience is that almost all course-related questions or requests that you would like to send the instructor by email are best answered in an in-person discussion. Hence the instructor holds most course-related discussions with the students in person (or over the phone) in her office during her office hours. If the instructor has a break during her office hours between meetings with students, then she will try to respond to your emails at that time. Note that a timely response by email is not guaranteed due to the reasons discussed above.

Instructor’s email: chirkova AT csc.ncsu.edu . Please be sure to read the instructor’s email policy, see above.

Phone number for instructor’s office hours: 919-513-3506

Instructor’s office hours (in EBII 2276; note exceptions):

  • For all on-campus students: Wednesdays 3:45-4:30pm.
  • For Section 601 students: Mondays 3:45-4:30pm Eastern Time. During the office hours, please see the instructor in person (if you’re on campus) in EBII-2276, or call her at 919-513-3506. During her office hours, the instructor will also be able to respond to your emails, which you can send to chirkova AT csc.ncsu.edu .

Teaching Assistants:

  • Keerthana Boloor: project coordinator
    Email: kboloor AT ncsu.edu
    Office: EBII-3044
    Office hours (note exceptions): Tuesdays 5-6pm Eastern Time
    Phone number: 919-513-7324 (only during office hours)
  • Seokyong Hong
    Email: shong3@ncsu.edu
    Office: EBII-1235
    Office hours (note exceptions): Fridays 3-4:30pm Eastern Time, for Section 601 students only
    Phone number: 919-513-7318 (only during office hours)
  • Jitendra Harlalka
    Email: jkharlal@ncsu.edu
    Office: EBII-1235
    Office hours (note exceptions): Thursdays 3:30-5pm, for Section 001 students only
    Phone number: 919-513-7318 (only during office hours)

Note: The course staff (the instructor and the TAs) are not giving lectures or holding office hours during University breaks or official University holidays.

Emails and postings, courtesy of Dr. Munindar Singh:

  1. Please direct your questions as follows. Please don’t use the message board for questions where you want the TA or the instructor to take specific action. Also don’t use the message board for questions about how your homework, program, or exam was graded. Use the message board for general assignment-related questions for which it is OK to share knowledge (don’t post ideas or code listings that give away part of a solution that students or teams are supposed to obtain by themselves).

    • All project-related questions that are not already answered on the course-project page: to project coordinator (Keerthana Boloor).
    • General content and exams: to the instructor and cc the TAs.
    • Homeworks and old exams: to the TA and cc the instructor.
    • Videos and notes posted by the EOL Program Office: to the EOL Program Office.
    • Problems with accounts, servers, submit, grade book, …: to the TA and cc the instructor.
    • Missing or erroneous grades: to the TA and cc the instructor.
    • Misc (teams, programming bugs, …): to the message board.
    • Questions about policies and schedules: If you find an error, please let the instructor know, but don’t send around emails requesting information that’s on the Web.
  2. In the interest of fairness, the instructor tries to post answers to questions asked privately or by email on the course mailing list.

Textbook (recommended; is available on reserves in the Textiles library):
TFOP: File Organization and Processing, Alan L. Tharp, John Wiley & Sons.

Supplemental Texts:

  • Handbook of Data Structures and Applications, Chapman & Hall/CRC Computer & Information Science Series
  • File Structures, M. J. Folk, B. Zoellick, and G. Riccardi, Addison-Wesley
  • Fundamental Algorithms, D. Knuth, Addison-Wesley
  • Searching and Sorting, D. Knuth, Addison-Wesley

Course Goals

My goals for you are to:

  • learn about how information is maintained on external storage devices, both in physical and logical contexts,
  • learn how to use order notation to describe an algorithm’s space and execution efficiency,
  • study the strengths and limitations of a number of common in-memory searching and sorting algorithms, and
  • study the strengths and limitations of a number of common external searching and sorting algorithms.

Course Overview

This is a course on advanced data structures for searching and sorting. The course will introduce a number of internal and external searching and sorting techniques. Algorithm efficiency will be discussed, both in terms of execution time and space.

The topics on the course web page represent a tentative course schedule. Please note that time frames for each topic will be confirmed in class and are subject to possible changes. Homework and project due dates and the midterm exam date will be announced in class.

Schedule of Reading Assignments

Apart from material in the recommended textbook (TFOP) and in the online notes related to individual lectures, no additional readings will be assigned. Students will be informed in class at the beginning of each week which sections of the textbook will be covered during that week’s lectures.

Homework and Test Schedule (tentative)

  • Homework 1 “In-Memory vs. Disk-Based Searching”: Written component due in hardcopy in class before the lecture on Monday September 5 (EOL students please see here for important information on assignment acceptance and submission), executables due (via submit board, see here; each submit board is set up once the related project assignment is announced) by 11:59pm Monday, September 5.
  • Project-selection deadline: (via submit board, see here) 11:59pm Wednesday, August 31.
  • Project teams announced by the course staff: by Monday September 5.
  • Homework 2 “In-Memory Indexing with Availability Lists”: Due (via submit board, see here) by 11:59pm Friday, September 23.
  • Project Submission Deadline I (via submit board, see here): by 11:59pm on Friday September 30.
  • Midterm Exam (please see here for a sample exam): in class Monday, October 10. EOL students please see here for important information on the administration of the exam. One early exam will be given to students with NCSU-recognized October 10 absence reasons, provided the instructor gets from the students advance notice of at least two weeks before the planned early-exam date.
  • Homework 3 “Sorting Algorithms”: Due in hardcopy in class before the lecture on Wednesday, October 5 (EOL students please see here for important information on assignment acceptance and submission).
  • Homework 4 “External Searching”: Due (via submit board, see here) by 11:59pm Friday, October 28.
  • Project Submission Deadline II (last part; via submit board, see here): deadline extension: by 11:59pm on Monday November 21.
  • Homework 5 “External Chained Hashing”: Due (via submit board, see here) by 11:59pm Monday, November 14.
  • Final Exam: In the usual CSC 541 classroom, on Friday, December 9, starting at 1pm for Section 001 students. EOL students please see here for important information on the administration of the exam. One early exam will be given to students with NCSU-recognized December 9 absence reasons, provided the instructor gets from the students advance notice of at least two weeks before the planned early-exam date.

Gadgets in Class

All on-campus students: Please turn off all your electronic gadgets, with the exception of computers, during class. You do not have to turn off your computers, but please leave your computers with their screens down at all times during the lectures.

Grading

Grades for the course will be made up from five assignments, a course project, a midterm exam, and a final exam. You are expected to view all lectures and to read any on-line notes and programs I provide. Missed exams cannot be made up without an official university excuse. Homework should be submitted via Wolfware by 11:59PM on the due date. Late homework or project deliverables will not be accepted under any circumstances.

Final grades will be calculated as follows, using +/- grading:

Homework: 10%
Course project: 30%
Midterm Exam: 25%
Final Exam: 35%

Multiple solutions or papers submitted per assignment: If a student submits multiple solutions to the same problem in the same paper (in both the exam papers and the non-exam papers submitted for grading) and does not clearly mark exactly one of these solutions as “the correct” solution, then none of the solutions to the problem will be graded. In case a student submits multiple papers for the same assignment, the course staff will grade the latest submission only, even if the latest submission does not contain solutions to some problems, and solutions to those problems can be found in earlier submissions for that assignment. (If the multiple submissions are not timestamped by the student, then the course staff reserves the right to pick one submission out of these multiple submissions and to grade that submission only.) In case that latest submission for an assignment is a late submission (permitted only if the submission conforms to the late-submission policy), no earlier-submitted paper will be considered for grading for that assignment, even if that earlier-submitted paper is not a late submission.

The Regrade Policy, courtesy of Michael Genesereth at Stanford University (note that all regrade issues must be resolved before the end of the last day of classes):
If you feel a regrade is warranted, the following items may be of interest.

  • You must clearly and concisely describe why what you wrote is more correct than we gave you credit for.
  • The entire problem set or project report will be regraded. In other words, by requesting a regrade, you risk lowering your grade.
  • You should send your request, in writing, to the TAs and copy the instructor.
  • All regrade requests must be made no later than a week after each homework, project report, or exam is returned to the class.
  • The only exception to the second bullet point will be arithmetic errors. If we subtracted 5 from 10 and gave you 2, we won’t regrade your solutions.

Self-Study Responsibilities

Some of the topics are important but are either quite straightforward or not a main focus of this course. These topics will be identified by the instructor as self-study topics. Your knowledge of them will be evaluated as appropriate through exams, homeworks, programming assignments, or the project.

Prerequisites

You should have taken CSC 314 Data Structures, or must be willing to learn/review the material as we go. All the homeworks are to be done in C++. The project information can be found at this page.

Topics

All the information in this section is tentative as of August 2011.

  1. Physical Disk and Storage Basics
  2. Order Notation
  3. Logical File Organization
  4. Sequential and Direct File Access
  5. File Management Strategies
  6. File Indexing
  7. Internal Sorting Algorithms
  8. Internal Searching Algorithms
  9. External Sorting Algorithms
  10. External Searching Algorithms
  11. Hashing
  12. Distributed Hash Tables
  13. Signature Files
  14. (time permitting) File Systems

Academic Honesty:

The university, college, and department policies against academic dishonesty will be strictly enforced. You may obtain copies of the NCSU Code of Student Conduct from the Office of Student Conduct; please see Section 7 for a statement on academic integrity.

Unless otherwise specified, every gradable part of the course requires individual work. Where collaboration is permitted, students may discuss problems with each other, but the solutions must still reflect their individual understanding. All kinds of collusion will be subject to disciplinary action. Students must acknowledge sources such as books (other than the recommended textbook) and old assignments. Unacknowledged use of any such material is subject to disciplinary action. Any attempts to circumvent computing system security or interfere with others’ work will also be subject to disciplinary action.

Compliance will be monitored by the MOSS software, which is very good at detecting similarities in programs. MOSS has been used to successfully identify cases of copying or plagiarism in a number of CSC courses, and will be applied to all programming assignments you complete.

Mailing List:

You will be added or deleted from the course mailing list automatically based on your registration status for the course. The TAs can insert and remove additional addresses, but neither the TAs nor the instructor can remove your unity address from the mailing list.

The instructor will make each course-related announcement available on the course announcements page. Students are responsible for monitoring the relevant web page, at least twice a week, for any course-related announcements.

Exam Instructions:

The final exam is cumulative. All questions in both exams will be covered by the topics presented during the semester. Both exams are closed book. However, a one-page (i.e., one-side) crib sheet (plus the crib sheets for previous tests, if any in the same class) may be used, on no more space than letter-size paper (8.5 x 11 inches), with font size 12 or larger, or a handwritten equivalent, with at least 1-inch margin on each of the four sides of the sheet. Crib sheets may not be shared. Abuse of the crib-sheet requirement is considered cheating. Collusion or cheating of any form are forbidden. You can be asked to explain your solutions verbally. The instructor never asks any trick questions. In solving exam problems, you may not make additional assumptions, unless explicitly encouraged to do so in the problem assignment. You can get partial credit for an English description.

Project Instructions:

Planned project topics:

  • Project 1: Data Structures and Algorithms for “Lookups” in Networking Applications. In this project, the students will learn about the data structures and algorithms that are required for building networking applications. In-memory and external search and sort techniques will need to be applied on specific data structures, which are designed mainly for “lookups” as applied in large networks and data centers.
  • Project 2: SOA-based enterprise configuration analysis. In this project, the students will need to query, persist and transform XML documents. The XML documents that need to be analyzed are “real-world” enterprise configuration files. In this project, the students will explore the applicability of in-memory and external search and sort techniques to industry standard file analysis.

The project is meant to be carried out by student teams on NCSU computing facilities. The software produced should be compilable and executable on NCSU machines. If you wish to use off-campus resources, you may do so, but only with explicit prior approval. Approval usually requires that (a) there is no inconvenience to the other members of your team, (b) you are willing and able to manage without any support from NCSU staff, and (c) you are willing and able to assist the course staff in evaluating your work, e.g., by connecting to the off-campus site or by transporting equipment to campus.

Often, a project requires teamwork. Project teams ought not change over the semester. However, a change may be allowed for a good reason. Each member of the team is expected to work hard. Team members are encouraged to resolve technical differences through discussion. However, if some member is not working satisfactorily, complaints from the other members will be entertained, possibly leading to a reduction in credit for the person not working satisfactorily.

The teams may use the free Elluminate Live! tool to enable project-related meetings throughout the semester.

Attendance:

The instructor expects good attendance by the students. If you miss a class it is your responsibility to make up. If you miss an exam, please supply official documentation in order to get credit. For anticipated absences, the instructor will give one exam prior to the exam date for the rest of the class; all early-final-exam requests must be received by the instructor at least 15 days before the regular scheduled date for the exam. For unanticipated documented absences, the instructor will prorate your scores on the other exams. You should review the official attendance regulations at http://www.ncsu.edu/policies/academic_affairs/courses_undergrad/REG02.20.3.php.

Grade Assignment:

The weights of different components of the course are specified in the course description. The instructor generally assigns nominal grades based on the total score. However, she also looks at the whole record to decide if a student merits a better grade than the nominal one. Specifically, the instructor considers the score in the exams seriously in moving students to a better grade.

Students with Disabilities:

Reasonable accommodations will be made for students with verifiable disabilities. In order to take advantage of available accommodations, students must register with Disability Services for Students, phone 515-7653 (voice) or 515-8830 (TTY).

All Other Questions You Might Have
In case any issues arise that have not been discussed in the preceding sections of this syllabus, each issue will be resolved as outlined in Dr. Munindar Singh’s Course Policies. For all the issues that are discussed in both this CSC 541 syllabus and Dr. Munindar Singh’s Course Policies, the rules of this CSC 541 syllabus supersede those of Dr. Munindar Singh’s Course Policies.

Acknowledgments

Many thanks to Dr. Christopher Healey and to Dr. Alan Tharp for providing the materials for this course.