Syllabus
JOUR 72312 Data Visualization
Spring 2014 CHUN
Fridays, 9:30AM-12:20PM
Room 436
Public Class notes:
http://bit.ly/datavizspring2015
It isn’t hyperbole: journalists today have access to more data than ever before, as well as better tools to understand that data and expose the stories buried in the numbers. From election results, budgets and census reports, to Facebook updates and image uploads, journalists need to know how to find stories in data and shape them in compelling ways. This hands-on course teaches you to gather and analyze data, and visualize interactive data-driven stories. This burgeoning discipline touches on information and interactivity design, mapping, graphing, data analysis, and a bit of code. Participants are expected to pitch, report, and produce stories working alone and in teams. You’ll learn to use spreadsheets and online Web tools such as CartoDB, Refine, and HighCharts, and integrate them in a non code-intensive development environment. Familiarity with HTML/CSS is helpful, but not required. This is not a course in programming, but we will be dealing with some code.
Course objectives
This three-credit course explores complex storytelling using data. Students will pitch, report, conceptualize, design, and produce informative and compelling data-driven pieces. The course emphasizes:
- Data collection
- Editing and organizing data while maintaining its integrity
- Basic statistical methods and concepts, the foundation of solid data reporting
- Understanding technologies available to create online, interactive data-driven stories
- Design basics, effective visual communication, and data visualization
- Applying interactivity to data-driven stories
- Critical evaluation of professional data-driven news stories (what makes a particular project successful?)
- Seeking out innovative uses of data
- Understanding the development process for creating data stories
Course outcomes
At the end of this course, students will be able to:
- Identify patterns in data that help uncover news trends
- Conceptualize clear and concise ways to illustrate these trends
- Create interactive graphics using both custom tools and web-based services
- Evaluate effectiveness of data-based storytelling projects, both of their own creation and across the industry.
- Instruct and supervise fellow journalists and programmers in identifying and producing stories that can become effective data stories.
Faculty
Russell Chun Office: 432 russell.chun@journalism.cuny.edu
Russell Chun is a multimedia developer, author, and educator specializing in visualizing science, data, and story ideas for the web. He is on the adjunct faculty at City University of New York (CUNY) Graduate School of Journalism where he teaches data-driven interactive journalism. He is also Department Head of the Illustration Program at Sessions College for Professional Design. He is the author of several books on multimedia, and has developed courses and interactive and video products on effective multimedia. Russell previously taught at Columbia University and the University of California at Berkeley Graduate Schools of Journalism. He’s served as an interactive consultant and trainer for News21, a Carnegie/Knight-funded national initiative to improve the quality of journalism education in the United States. He’s judged local and national multimedia news contests.
WordPress and Digital Storage
Your major stories will be uploaded to CUNY’s Digital Storage server. Note that this web hosting will be available to you for two years after you graduate, so you should make plans to back up work you are proud of and find hosting for it off campus servers. Some assignments will be posted to a class blog. Students will be required to present their stories in class for critique. Posts to the class blog are public by default, but you can choose to keep them private if you prefer. Students are encouraged to submit superior and/or timely work for publication elsewhere, including school outlets such as the New York City News Service.
Software Requirements
- Tabula allows you to extract structured data from PDFs
- OpenRefine is indispensable for cleaning messy data
- An FTP client like Fetch or Cyberduck to transfer files to the server.
- A text editor like TextWrangler.
- Microsoft Excel
- Google Chrome as your Web browser
- Google Spreadsheets and Fusion Tables, which are available as apps from your Google Drive.
- You will also be asked to create accounts on JSFiddle and CartoDB. Important: before you create your CartoDB account, make sure you have the information you need to get the free academic version.
Readings
Readings are available on e-reserve from the Research Center. We will be reading from:
- Cairo, The Functional Art
- Iliinsky and Steele, Designing Data Visualizations
- Tufte, The Visual Display of Quantitative Information
Grading
Your grade is determined by three factors: participation, successful completion of all solo homework assignments, and successful completion of the two team stories and the one solo story. Your participation includes attending all classes, being active in discussions, workshops and critiques, presenting your story for the Data Festival, and participating in all in-class hands-on activities. Your assignments will be evaluated in terms of use of data, story and context, interactivity, and design.
Participation: 20%
Homework assignments: 20%
Story 1 (team): 20%
Story 2 (solo): 20%
Story 3 (team): 20%
Grades for your two team stories are further broken down as follows:
Pitch (25%)
Storyboard (12.5%)
Draft (25%)
Final (25%)
Revision (12.5%)
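As a rough illustration of how the percentages above combine, here is a minimal Python sketch. The weights come from this syllabus; the 0-100 scoring scale, the function name `weighted_score`, and the sample scores are assumptions made only for the example, not the instructor's actual grading procedure.

```python
# Weights copied from the syllabus. Scores are ASSUMED to be on a 0-100 scale.
COURSE_WEIGHTS = {
    "participation": 0.20,
    "homework": 0.20,
    "story1_team": 0.20,
    "story2_solo": 0.20,
    "story3_team": 0.20,
}
TEAM_STORY_WEIGHTS = {
    "pitch": 0.25,
    "storyboard": 0.125,
    "draft": 0.25,
    "final": 0.25,
    "revision": 0.125,
}


def weighted_score(scores, weights):
    """Return the weighted average of component scores (hypothetical helper)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must name the same components")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical team story: strong final and revision, weak pitch and draft.
team_story = weighted_score(
    {"pitch": 40, "storyboard": 80, "draft": 40, "final": 100, "revision": 100},
    TEAM_STORY_WEIGHTS,
)
print(team_story)  # 67.5
```

Because the pitch and draft together carry half of a team story's grade, the sample story tops out well below the score of its final version.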
Grades for your individual stories are further broken down as follows:
Pitch (20%)
Final (40%)
Revision (40%)
This means that if you complete a brilliant story but don’t put real effort into the initial pitch or rough draft, you can’t get better than a C on the story. All assignments are due at the beginning of class. E-mail all assignments directly to the professor with “Homework Week X” in the subject line, where X is the number of the week. If we can’t find your homework because of an incorrect subject line, you won’t get credit for it. What do we mean when we say “Pitch” or “Rough draft”? This is what we mean:
Pitches: A complete pitch should tell us who cares, why we care now, and what pre-reporting you’ve done. You must include:
- a proposed title or headline
- a news hook, or explanation of why this story matters now. Why should I care?
- your nut that captures the essence of the story. 1-2 sentences only.
- a description of and link to the data (which means you must already have your data)
- one source you have already spoken with or at least three potential expert sources and your plans for reaching them
Storyboards: A storyboard organizes your content conceptually and spatially. This semester, when you turn in storyboards, you should also include a revised pitch. We use wireframe and storyboards interchangeably here. We’re looking for a simple sketch (on paper, in Word, or PowerPoint, Illustrator, or any number of online storyboarding tools) that shows us how you intend to integrate your visualizations, words, and navigation elements. Use simple boxes to tell us where your different elements will be positioned in a design, and how a user will navigate through the content. Check out Mark Luckie’s thoughts on sketching/storyboarding, with examples, from 10,000 Words.
Rough Drafts: A rough draft does not have the polish of a final project, but it should be close. You should have created all the visualizations that you plan to use. Your classmates should be able to evaluate a rough draft on its merits, without a guided tour of forthcoming features. A complete rough draft includes:
- Clean data in spreadsheets, already normalized, sorted, manipulated
- Visualizations of the data with proper labels, keys, and/or legends
- Captions
- Credits
- A headline
- At least three hyperlinks to other reporting, integrated in the body of your text, that put your story in a broader context.
- Text that incorporates reporting from at least one human source. You’re not required to quote your source, but you do need to be able to tell the class what insights your human source provided.
Final Story: Your story must be posted to the class blog. The blog will be kept private to keep rough drafts private. If you wish to host your final story elsewhere, you may, but you still need to post a headline, excerpt, image and link to the class blog.
Plagiarism
It is a serious ethical violation to take any material created by another person and represent it as your own original work. Any such plagiarism will result in serious disciplinary action, possibly including dismissal from the CUNY J-School. Plagiarism may involve copying text from a book or magazine without attributing the source, or lifting words, code, photographs, videos, or other materials from the Internet and attempting to pass them off as your own. Please ask the instructor if you have any questions about how to distinguish between acceptable research and plagiarism.
Copyright
In addition to being a serious academic issue, copyright is a serious legal issue. Never “lift” or “borrow” or “appropriate” or “repurpose” graphics, audio, or code without both permission and attribution. This applies to scripts, audio, video clips, programs, photos, drawings, and other images, and it includes images found online and in books. Create your own graphics, seek out images that are in the public domain or shared via a Creative Commons license that allows derivative works, or use images from the AP Photo Bank or images for which the school has obtained licensing. If you’re repurposing code, be sure to keep the original licensing intact. If you’re not sure how to credit code, ask. The exception to this rule is fair use: if your story is about the image itself, it is often acceptable to reproduce the image. If you want to better understand fair use, the Citizen Media Law Project is an excellent resource: http://www.citmedialaw.org/legal-guide
As with plagiarism, when in doubt: ask.
Deadlines
Deadlines on assignments – as in any newsroom – are sacrosanct and should not be missed without exceptionally good reason, and only when your instructors have been notified in advance. If you are taking the course for credit, late assignments will be assessed a one-half grade penalty for every day overdue.
Absences and Tardiness
Participation and attendance are important ingredients in your success in the class, especially in this course where your major assignments are team-based. Please be on time for class and back to class from breaks. Repeated tardiness will reduce your participation grade. Notify the instructors of any absences before class, or as soon as you know you will be out.
SYLLABUS in BRIEF
Lecture: What you can expect from us | Homework: What we expect from you (due)
Festival of Data: Every week one student will choose a data-driven story to present in class. Prepare to discuss the strengths and weaknesses of the story, the authors’ use of data as well as their use of interactivity, and to identify the underlying technology. Blog your story in the “Festival of Data” category by the start of class in your week.
SYLLABUS in DETAIL
Suggested readings:
- “Data Points” column at Columbia Journalism Review: http://www.cjr.org/data_points/
- Kevin Quealy’s blog, “Charts and Things”: http://chartsnthings.tumblr.com
- Source, the Knight-Mozilla blog on code and journalism: http://source.mozillaopennews.org/en-US/learning/
Due on the first day of class: Watch Geoff McGhee’s Knight Fellowship Report on Data Journalism at http://datajournalism.stanford.edu/
- Chapter 2 Data Vis in Journalism
- Chapter 3 Telling “Data Stories”
- Chapter 6 Exploring Data
1 | Defining and Finding Data
Course introduction (expectations, syllabus review). What is data, what are data stories? Reactions to McGhee’s data journalism video report. Data Viz Pre-test. Discussion: work in groups to evaluate recent data-driven stories. Discussion: Looking for data, where to look and how to look?
HOMEWORK: Find two datasets that interest you. For each, tell us who maintains it, where the data can be found (the URL), and in 1-2 sentences explain why the data is interesting. Read Cairo: The Functional Art, Reading part 1: pages 25-31, 36-44, on thinking through a visualization as a tool for the reader; what graphical form best serves the goal? On e-reserve in the Library.
2 | Finding the Story in Your Data
Discuss homework: problems, challenges, solutions. Discuss: provenance and staying organized. Spreadsheet review: data types, rows and columns, sorting, copy and paste, selections, formulas. Review pivot tables. Conditional formatting. In-Class Exercise: Using spreadsheets and pivot tables.
HOMEWORK: Spreadsheet assignment. See Homework page for details.
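For students who want to see what a pivot table does under the hood, the following is a minimal Python sketch of the idea: group rows by one field, spread a second field across columns, and sum a third. The sample rows, field names, and the helper `pivot_sum` are all hypothetical and are not part of the class materials; in class the same result comes from the spreadsheet's pivot table feature.

```python
from collections import defaultdict

# Hypothetical rows, as they might appear in a spreadsheet (one dict per row).
rows = [
    {"borough": "Brooklyn", "grade": "A", "violations": 2},
    {"borough": "Brooklyn", "grade": "B", "violations": 5},
    {"borough": "Queens", "grade": "A", "violations": 1},
    {"borough": "Queens", "grade": "A", "violations": 3},
    {"borough": "Bronx", "grade": "B", "violations": 4},
]


def pivot_sum(rows, row_field, col_field, value_field):
    """Sum value_field for every (row_field, col_field) pair, like a pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_field]][row[col_field]] += row[value_field]
    # Convert nested defaultdicts to plain dicts for easy printing and comparison.
    return {key: dict(cols) for key, cols in table.items()}


pivot = pivot_sum(rows, "borough", "grade", "violations")
for borough, by_grade in pivot.items():
    print(borough, by_grade)
```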
3 | Cleaning Data
Cleaning data and advanced spreadsheets. OpenRefine and common spreadsheet formulas: split, concatenate, unique, countif, sum. In-Class Exercise: working with Refine to clean data.
HOMEWORK: Clean a dataset (I will provide you with one) with Refine and tell us your findings in a nutgraf. Email your cleaned dataset and nutgraf under the subject “Homework Week 3”. See the “Homework” page for more details.
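The five spreadsheet operations named for this week (split, concatenate, unique, countif, sum) each have a simple equivalent in code. The Python sketch below is only an illustration on made-up data; the names and amounts are hypothetical, and the class itself uses spreadsheet formulas and Refine rather than Python.

```python
# Hypothetical messy column: "Last, First" names with one duplicate entry.
names = ["Smith, Jane", "Doe, John", "Smith, Jane", "Lee, Ann"]
amounts = [100, 250, 50, 75]

# split: break each cell into its parts at the ", " separator
parts = [name.split(", ") for name in names]

# concatenate: rebuild each name as "First Last"
full_names = [first + " " + last for last, first in parts]

# unique: drop repeats while keeping the original order
unique_names = list(dict.fromkeys(full_names))

# countif: count the cells that match a condition
jane_count = sum(1 for name in full_names if name == "Jane Smith")

# sum: add up a numeric column
total = sum(amounts)

print(unique_names)  # ['Jane Smith', 'John Doe', 'Ann Lee']
print(jane_count)    # 2
print(total)         # 475
```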
4 | Graphical Encodings/Charting Data
Discuss homework. Discuss anatomy of a news chart: all the little pieces. Chart types – what they’re good for, what they aren’t. Cleveland and McGill’s findings on readability of chart types. Excel to Illustrator, SVG charts, Chartbuilder, Google Charts, Raw, Charted. Pitching a story: what we expect, what you’re thinking. Choose teams for your first story.
HOMEWORK:
- Team – pitches for your first story. A complete pitch should tell us who cares, why we care now, and what pre-reporting you’ve done. Review the section (above) in the syllabus on what is expected in a pitch.
- Pitches must be posted to the class blog, in the “story 1 pitches” category by Thursday morning.
- Read Chapter 4: Choose Appropriate Visual Encodings in Designing Data Visualizations by Steele and Iliinsky (in Library)
- Read Cairo: The Functional Art, Reading part 2: pages 118-129, on Cleveland & McGill’s perceptual accuracy
5 | Geographic encoding/Mapping Data
Discussion: Looking at map examples. In-Class Exercise: Mapping with CartoDB. Geocoding, Shapefiles, Fusing two data sets, customizing infoboxes, colors, using filters. SQL for selectors. Workshop: Pairs of teams work with each other to discuss pitches.
HOMEWORK:
- Read Cairo: The Functional Art, Reading part 3: pages 73-86, on presentation
- Team – refine your pitches and bring in storyboards for your first story. A storyboard organizes your content conceptually and spatially. This semester, when you turn in storyboards, you should also include a revised pitch. We use wireframe and storyboards interchangeably here. We’re looking for a simple sketch (on paper, in Word, or PowerPoint, Illustrator, or any number of online storyboarding tools) that shows us how you intend to integrate your visualizations, words, and (optional) navigation elements. Use simple boxes to tell us where your different elements will be positioned in a design, and how a user will navigate through the content. Scan your sketch and include it with your post.
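The "SQL for selectors" topic from this week's class boils down to writing a query that keeps only the rows you want to show. The sketch below illustrates that idea with Python's built-in sqlite3 module as a stand-in; the table, column names, and values are hypothetical, and CartoDB's own SQL editor and dialect may differ from SQLite.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a mapping table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (name TEXT, borough TEXT, enrollment INTEGER)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [
        ("PS 1", "Brooklyn", 620),
        ("PS 2", "Queens", 480),
        ("PS 3", "Brooklyn", 310),
        ("PS 4", "Brooklyn", 900),
    ],
)

# The WHERE clause is the "selector": only matching rows would be drawn on a map.
selected = conn.execute(
    "SELECT name FROM schools WHERE borough = ? AND enrollment > ? ORDER BY name",
    ("Brooklyn", 500),
).fetchall()
conn.close()

print(selected)  # [('PS 1',), ('PS 4',)]
```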
6 | Presentation, Information design, ethics
Integrating the presentation: Annotating the data, Design, and Interactivity. Discussion: storyboards. Intentional use of space. Discussion: Principles of design – grids, hierarchies, color, typography, white space, scale, repetition, consistency. Discussion: Ethics, avoiding distortion, responsible presentation of data. What do we expect in a “rough draft”?
HOMEWORK:
- Team – Rough drafts of your first story. A rough draft does not have the polish of a final project, but it should be close. You should have created all the visualizations that you plan to use. Your classmates should be able to evaluate a rough draft on its merits, without a guided tour of forthcoming features. Use a Bootstrap template for your story, and post the link on the blog. A complete rough draft includes:
- Clean data in spreadsheets, already normalized, sorted, manipulated
- Visualizations of the data with proper labels, keys, and/or legends
- Captions
- Credits
- A headline
- At least three hyperlinks to other reporting, integrated in the body of your text, that put your story in a broader context.
- Text that incorporates reporting from at least one human source. You’re not required to quote your source, but you do need to be able to tell the class what insights your human source provided.
- Read selections from Tufte, The Visual Display of Quantitative Information, on e-reserve in the Library: pages 91-105, 176-190.
7 | Open Workshop
Open workshop
HOMEWORK:
- Team – post final story on the class blog in the category “Story 1 Final”. See the rest of the checklist to ensure that your story is complete (see Checklist page).
8 | Critiques
Discussion: Critique our first finished data stories. Assign Teams for Second Story. Discuss solo story ideas.
HOMEWORK:
- Solo – pitch a story to be completed in one week as individuals.
- Team – Revise your first story.
9 | Advanced tools
Discussion: Revisions and solo pitches. In-class exercise: working with CartoDB and styling maps with CSS. HighCharts, Mr. DataConverter, and understanding different data formats. Using Odyssey.js for storytelling with maps.
HOMEWORK:
- Solo – Rough drafts due.
10 | Open Workshop
HOMEWORK:
11 | Critique and Interactive Tables
HOMEWORK:
12 | Pitches for class project
HOMEWORK:
13 | Open Workshop
Open workshop
HOMEWORK:
- Team – post final story on the class blog in the category “Story 3 Final”. See the rest of the checklist to ensure that your story is complete.
14 | Critique of final story
- Fill out student evaluations