Topics
- ❑ Introduction: Computer Science and Media Computation
- ❑ Computer Science is about specifying process (recipes)
- ❑ Why should you care about process?
- ❑ The Media Argument: If you ever want to say something that Adobe and Microsoft won't let you, you need to know something about programming
- ❑ All media are going digital
- ❑ Digital media are manipulated via software
- ❑ Programming (the creation of software) is a communications skill
- ❑ Many fields are about exact specification of process
- ❑ Business, science
- ❑ Specification of process involves lots of aspects
- ❑ What do you name things?
- ❑ What are your units? How do you describe the things your process is working with?
- ❑ How do you specify what to do and when?
- ❑ How do you do this without driving yourself crazy writing down tedious detail over and over again?
- ❑ How can you do this efficiently?
- ❑ KEY IDEA: Encoding
- ❑ Basically, computers only understand numbers
- ❑ Sequences of numbers from 0 to 255, to be exact
- ❑ But we can create standard definitions and agreements on how to interpret those numbers
- ❑ When you save an "A" in a file, your word processor actually stores the number 65. It agrees to INTERPRET that 65 as an A. (DEMO)
- ❑ Look at an A on the screen. It's actually a series of lit and unlit dots on the screen. We can represent a "graphical" A as a series of binary numbers that correspond to the dots on the screen.
- ❑ We can encode more complicated things by relying upon increasingly sophisticated encodings
- ❑ Complicated things like sounds, pictures, and movies (DEMO)
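As a rough illustration of the encoding idea above, here is a minimal Python sketch. Python is an assumption made for these examples, and the 5x5 bitmap and the variable names are hypothetical, not taken from the course demos. It uses the built-in `ord` and `chr` to move between a letter and its number, then treats a "graphical" A as rows of lit and unlit dots.

```python
# A letter is stored as a number; ord() and chr() convert between the two.
code = ord("A")      # the number behind the letter
letter = chr(65)     # the letter behind the number

# Text saved to a file is a sequence of numbers from 0 to 255 (bytes).
stored = list("ABC".encode("ascii"))

# A hypothetical 5x5 bitmap for a "graphical" A: 1 = lit dot, 0 = unlit dot.
bitmap_a = [
    "00100",
    "01010",
    "11111",
    "10001",
    "10001",
]

# Each row of dots can itself be read as one binary number.
rows_as_numbers = [int(row, 2) for row in bitmap_a]

# Draw the dots so the letter is visible in a terminal.
for row in bitmap_a:
    print(row.replace("1", "#").replace("0", " "))
```

The same numbers mean different things depending on which agreement (character table or dot pattern) is used to interpret them.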
- ❑ KEY IDEA: Naming
- ❑ Anything the computer knows about, we can associate with a name
- ❑ It's our own encoding -- we use it to establish our own conventions for how we want to think of things
- ❑ We can associate an encoding, like a sound, with a name, then manipulate the name like it's the original thing. (DEMO)
- ❑ Functions
- ❑ We've already been asking the computer to do things for us
- ❑ Print
- ❑ Making and playing sounds
- ❑ We can think of all of these actions as functions
- ❑ Like the math idea: There's some input, something happens, and there's an output
- ❑ With computers, we don't always care about the output.
- ❑ Sometimes we care about what happens IN the box, like with Print or playing a sound
- ❑ One of the things that a function can do for us is re-encode something or decode it
- ❑ There are functions that move between different representations of basic data like letters and numbers (DEMO)
- ❑ But more interestingly, there are functions that allow us to take apart sounds and pictures into their component encodings
- ❑ What does that mean? To get there, we need to know a little about the standards and encodings for sounds and pictures.
- ❑ Some examples: Playing with the samples in sounds, playing with the pixels in pictures (DEMO)
- ❑ Arrays and loops
- ❑ We talked about encodings and layers of encodings.
- ❑ The simplest layer of an encoding is going from one number, to a series of numbers.
- ❑ KEY CS IDEA: We'll call that an array
- ❑ We can create such series easily: [1, 2, 3], range(), etc. (DEMO)
- ❑ More importantly, sounds and pictures can be understood as an array of encodings (numbers), and we can manipulate these.
- ❑ How do we manipulate arrays? Typically with a loop
- ❑ We can do something to each element of an array with FOR X in ARRAY
- ❑ Example: Let's do something to each "sample" of a sound (DEMO)
- ❑ Multiply it by 2
- ❑ Multiply it by 0.5
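The "FOR X in ARRAY" idea above might look like the following sketch. The sample values and the `scale` function are hypothetical stand-ins; a real sound would come from a media library and hold thousands of samples per second.

```python
# Hypothetical sample values standing in for a recorded sound.
samples = [0, 100, -200, 300, -50]

def scale(samples, factor):
    """Return a new list with every sample multiplied by factor."""
    result = []
    for x in samples:                  # FOR X in ARRAY
        result.append(int(x * factor))
    return result

louder = scale(samples, 2)      # multiply each sample by 2
softer = scale(samples, 0.5)    # multiply each sample by 0.5
print(louder)
print(softer)
```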
- ❑ Sound
- ❑ Let's talk a bit about sound, using the MediaTools for examples, so that we can figure out what we're doing with it.
- ❑ Sound is a wave
- ❑ Physically, it's molecules moving around in the air -- hitting one another, and then getting bounced away. That creates pressure and less-pressure.
- ❑ We can see in an oscilloscope the increasing and decreasing pressure (DEMO)
- ❑ Increasing pressure is positive
- ❑ Decreasing pressure is negative
- ❑ Zero pressure is, well, zero
- ❑ We can associate numbers with the amount of pressure (actually, it's the voltage coming from the microphone) (DEMO)
- ❑ We can see that LOUDER sounds create a greater variation in the pressure: The difference from top to bottom gets larger
- ❑ softer sounds stay closer to zero
- ❑ Each of those numbers is called a sample
- ❑ Recording sound well takes LOTS of samples
- ❑ We can hear (most of us -- less so for us older folk) between roughly 20 Hz and 22 kHz
- ❑ That's from 20 ups-and-downs (cycles) per second up to 22,000 cycles per second
- ❑ There's a mathematical result that says that if you want to capture everything that occurs in a sound, you have to capture at TWICE the frequency of the highest sound you want to capture
- ❑ 2 * 22 kHz = 44 kHz
- ❑ This means that we have to capture 44,000 samples (numbers) PER SECOND to get everything in a recording that we might hear.
- ❑ That's roughly the rate at which a CD is recorded (44,100 samples per second, to be exact). All those numbers go in an array (a sequence)
- ❑ What were we doing when we multiplied or divided the samples? Increased or decreased the range => Increased or decreased the volume (DEMO)
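The sampling arithmetic above can be checked with a short sketch. The function names are hypothetical and exist only for this illustration; the doubling rule is the mathematical result mentioned above.

```python
def min_sample_rate(highest_frequency_hz):
    """Sampling must run at twice the highest frequency we want to capture."""
    return 2 * highest_frequency_hz

def samples_needed(seconds, sample_rate_hz):
    """How many numbers the array holds for a recording of this length."""
    return seconds * sample_rate_hz

rate = min_sample_rate(22000)            # samples per second for 22 kHz
one_minute = samples_needed(60, rate)    # array length for one minute of sound
print(rate, one_minute)
```

Even one minute of sound is an array of millions of numbers, which is why loops matter so much here.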
- ❑ Developing a mental model of the program: Debugging
- ❑ Computers only do what you tell them to do
- ❑ Computers only do what you tell them to do
- ❑ Computers only do what you tell them to do
- ❑ The tricky part is figuring out exactly what you told them to do
- ❑ There are two things that we're concerned with
- ❑ The "Flow of execution" -- what happens first, then second, then whatever
- ❑ The "Flow of data" -- what variables got what when
- ❑ Both parts can be studied by adding Print statements
- ❑ Print statements can be signposts telling you where the code is at and when (DEMO)
- ❑ Is your code running slowly? Where IS your execution? What's the computer doing NOW? A well-placed Print statement can show you that
- ❑ Print statements can also show you what the (invisible) variables' values are
- ❑ So far, we've only been dealing with linear flows of execution and iterative flows of execution (looping). Conditional is still to come, and then it gets trickier to track the flow
- ❑ Using show_vars for debugging trickier situations (DEMO)
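A minimal sketch of Print statements used as signposts, assuming a hypothetical `halve_all` function. One kind of print shows the flow of execution (where the code is), the other shows the flow of data (what the variables hold).

```python
def halve_all(samples):
    # Signpost: shows that execution reached this function, and with what input.
    print("halve_all: starting with", len(samples), "samples")
    result = []
    for index, value in enumerate(samples):
        new_value = value // 2
        # Data trace: shows what each variable holds on this pass of the loop.
        print("  index", index, "value", value, "->", new_value)
        result.append(new_value)
    # Signpost: shows that the loop finished.
    print("halve_all: done")
    return result

halved = halve_all([10, -4, 7])
```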
- ❑ Sound, Part 2
- ❑ We can use loops for more than just walking all the values of an array: We can also use them for generating values
- ❑ Think about what happens when a sound is played
- ❑ The samples are sent to the speaker, one at a time, at the same rate as they were recorded
- ❑ Consider what would happen if we skipped every other sample when we played it back.
- ❑ We'd double the frequency and halve the time (DEMO)
- ❑ What about if we skipped 0.5 samples each time (i.e., sent a sample twice)
- ❑ We'd halve the frequency and double the time (DEMO)
- ❑ Can we play with the frequency by changing the playback rate? Can we try this for a range of numbers?
- ❑ Sure, but isn't it tedious to type in all these examples?
- ❑ For x in [0.1, 0.2, 0.5, 1.0, 1.5, 2.0], play sound at freq x (DEMO)
- ❑ What happens if we add samples? We can create REVERB! (DEMO)
- ❑ We'll need two sounds, and we'll add from one to the other
- ❑ Now we'll need TWO loops
- ❑ One will track where we are in the source sample
- ❑ The other one will handle the fading out of the sound over time
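The two ideas above (skipping samples and adding samples) can be sketched as follows. The function names, the fractional-index approach, and the `fade` parameter are assumptions for illustration, not the course's actual demo code.

```python
def resample(samples, step):
    """Walk the source with a fractional index, advancing by step each time.
    step=2 skips every other sample; step=0.5 sends each sample twice."""
    result = []
    position = 0.0
    while position < len(samples):
        result.append(samples[int(position)])
        position += step
    return result

def add_echoes(samples, delay, echoes, fade):
    """Add delayed, progressively quieter copies of the sound onto itself."""
    result = list(samples) + [0] * (delay * echoes)
    volume = 1.0
    for echo in range(1, echoes + 1):          # loop 1: fading over time
        volume *= fade
        for i, value in enumerate(samples):    # loop 2: position in the source
            result[i + echo * delay] += int(value * volume)
    return result

source = [10, 20, 30, 40, 50, 60]
faster = resample(source, 2)      # half as many samples: higher pitch, shorter
slower = resample(source, 0.5)    # twice as many samples: lower pitch, longer
echoed = add_echoes([100, 0], delay=2, echoes=2, fade=0.5)

# Trying a range of rates without typing each example by hand.
for rate in [0.5, 1.0, 2.0]:
    print(rate, len(resample(source, rate)))
```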
- ❑ Images/Pictures
- ❑ KEY CS IDEA: A linear sequence of values is ONE way to think about data. Another common way is with a table
- ❑ Examples of tables: From newspapers, from textbooks, from lots of places
- ❑ Some recipes need more than a series
- ❑ A picture is actually a table of pixels
- ❑ Remember a sound was an array of samples, where each sample was just a number
- ❑ A picture is a table (not just a sequence) of a more sophisticated encoding.
- ❑ Not one number, but four.
- ❑ A value for the amount of redness (0 to 255)
- ❑ A value for the amount of greenness, of blueness, and of "transparency" (called alpha), all 0 to 255
- ❑ KEY CS IDEA: We can use representations and encodings within one another
- ❑ This is how it really works. Go put a magnifying glass on a TV, or a monitor, or even an LCD. The screen is made up of "dots" (picture elements => pixels), and each dot has smaller dots corresponding to different colors.
- ❑ We can load pictures, see their pixel values, set their pixel values (DEMO)
- ❑ Manipulating images: Changing ARGB values, Filtering, functions
- ❑ Pictures are made of pixels that we can change
- ❑ We can walk through an image and change all the reds to less red, or more red (DEMO)
- ❑ Same for green, blue, or alpha (DEMO)
- ❑ KEY CS IDEA We can make a function and apply it to a BUNCH of data
- ❑ Make up a function to do some kind of filtering (changing of pixel values) and use a loop to apply the function to each pixel in a picture
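Here is a small sketch of a picture as a table of pixels, with a function applied to every pixel by a loop. The 2x2 picture is hypothetical, and alpha is left out to keep the example short.

```python
# A hypothetical 2x2 picture: a table (list of rows) of (red, green, blue) pixels.
picture = [
    [(200, 10, 10), (100, 50, 0)],
    [(0, 255, 0), (255, 255, 255)],
]

def less_red(pixel):
    """A simple filter: halve the redness, leave green and blue alone."""
    red, green, blue = pixel
    return (red // 2, green, blue)

def apply_to_picture(picture, pixel_function):
    """Walk every row and every pixel, building a new filtered picture."""
    result = []
    for row in picture:
        new_row = []
        for pixel in row:
            new_row.append(pixel_function(pixel))
        result.append(new_row)
    return result

filtered = apply_to_picture(picture, less_red)
print(filtered)
```

Because the filter is passed in as a function, the same loop can apply any other filter to the same picture.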
- ❑ Filtering Part 2: Using more sophisticated functions. Conditionals
- ❑ By just increasing/decreasing pixel values, we're doing simple Photoshop-style filtering, but that's pretty simplistic.
- ❑ It's called a linear function
- ❑ Often, you want to make choices about pixels and treat them differently
- ❑ For example: Everything that's mostly red (say, over 200), decrease the red. If the redness is less than 200, leave it alone
- ❑ This is a form of thresholding: There's a threshold value that determines what you do.
- ❑ KEY CS IDEA Computers don't have to do just one thing after another, nor just loop. They can also make choices
- ❑ But really limited choices: They only understand numbers, they only understand numbers, and they only understand numbers
- ❑ We can use an If-Then which is a Test.
- ❑ IF this is true, Then the computer must, Must, MUST do the Then
- ❑ No choices about it. "Test this, computer. And if it's true, DO IT!"
- ❑ Change the redness, if needed (DEMO)
- ❑ Creating more kinds of threshold functions (DEMO)
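The thresholding example above might be written like this. The threshold of 200 comes from the outline; the amount subtracted (50) and the function name are assumptions for illustration.

```python
def tame_red(pixel, threshold=200):
    """If the pixel is mostly red (over the threshold), decrease the red.
    Otherwise leave the pixel alone."""
    red, green, blue = pixel
    if red > threshold:      # the Test
        red = red - 50       # the Then: done only when the test is true
    return (red, green, blue)

print(tame_red((250, 0, 0)))
print(tame_red((120, 30, 30)))
```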
- ❑ Working with a portion of an image: Masks and looping
- ❑ Oftentimes you don't want to do something to an entire image. Instead, you want to apply some filter to just part of an image
- ❑ For example, turn the lady's hat red; blur only the one character; blur everything BUT the one character; make the box eerily-transparent
- ❑ How do we do that? Easiest way: change the limits on your loops
- ❑ Don't go from 1 to the width and 1 to the height.
- ❑ Instead, just go from 10 to 250 and 5 to 100 (for example) DEMO
- ❑ But that approach can only handle a rectangle. Most areas you want to manipulate are more complicated
- ❑ The answer: Compute a mask
- ❑ This is an unusual thing. It's a "picture" (of a sort) with data in it that does NOT correspond to pixels. Instead we just store 0's and 1's in it.
- ❑ KEY CS IDEA: Creating some data just to make the "recipe" easier
- ❑ 1's mean "This is part of the picture that we want to process"
- ❑ 0's mean "This is part of the picture that we do NOT want to process"
- ❑ You end up going through all the pixels TWICE
- ❑ First time, with a conditional, to decide whether you want to process that part of the picture. If so, put a 1 in the corresponding parts of the mask, and a 0 in all the others
- ❑ Now, go through all the pixels, and wherever there's a 1, apply the filter/function. Otherwise, don't.
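The two passes described above can be sketched as follows. The test (`mostly_red`), the filter (`to_gray`), and the tiny picture are hypothetical choices made for this illustration.

```python
def compute_mask(picture, test):
    """First pass: 1 where the pixel should be processed, 0 everywhere else."""
    mask = []
    for row in picture:
        mask.append([1 if test(pixel) else 0 for pixel in row])
    return mask

def apply_with_mask(picture, mask, pixel_function):
    """Second pass: apply the filter only where the mask holds a 1."""
    result = []
    for row, mask_row in zip(picture, mask):
        new_row = []
        for pixel, flag in zip(row, mask_row):
            new_row.append(pixel_function(pixel) if flag == 1 else pixel)
        result.append(new_row)
    return result

def mostly_red(pixel):
    return pixel[0] > 200

def to_gray(pixel):
    average = sum(pixel) // 3
    return (average, average, average)

picture = [
    [(220, 10, 10), (10, 10, 10)],
    [(30, 200, 30), (250, 40, 40)],
]
mask = compute_mask(picture, mostly_red)
masked = apply_with_mask(picture, mask, to_gray)
print(mask)
print(masked)
```

The mask is extra data created only to make the recipe easier: it can describe any shape, not just a rectangle.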
- ❑ Drawing on an image: Adding lines, circles, text, and other elements
- ❑ Developing a mental model of the program: Debugging conditionals
- ❑ How do you figure out what happened where? Especially with sequential, iterative, and conditional computation?
- ❑ KEY CS IDEA: Play computer! Trace the program and do as it would do
- ❑ Using Print statements to find out where it went and what the data values are is KEY
- ❑ Important to look at the input: What's really going on?
- ❑ In real software development, you work at figuring out the input that breaks a program
- ❑ One way of finding these is to try the boundary conditions: What happens at 0? At 255?
- ❑ Then you make it work for that input too
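A small sketch of testing boundary conditions, assuming a hypothetical `brighten` function for a single color value. Without the clamping, adding to 255 or subtracting from 0 would produce numbers outside the 0 to 255 encoding.

```python
def brighten(value, amount):
    """Change a color value, keeping the result inside the 0..255 range."""
    return max(0, min(255, value + amount))

# Try the inputs most likely to break the program: the edges of the range.
for value in [0, 1, 254, 255]:
    print(value, brighten(value, 10), brighten(value, -10))
```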
- ❑ Files
- ❑ Media are mostly stored in files today. If you want to manipulate lots of media, you need to manipulate files.
- ❑ There are more ways of encoding data than just arrays and tables
- ❑ KEY CS IDEA Frequently, you want a hierarchy, which is represented as a tree
- ❑ Think about outlines, about the structure of dictionaries and encyclopedias
- ❑ You've seen this already in file structures. Directories are nodes in a tree. Files in the directory (a sequence within the node) are peers or children of the directory
- ❑ We can use the directory tree to find things, move things around
- ❑ Remember how pixels are inside pictures -- a sophisticated encoding inside of another encoding? Files are similar
- ❑ Files have contents, creation dates, names -- all inside of a linear list inside a directory, which is part of a tree
- ❑ It's data representations and encodings all the way up and down!
- ❑ We can write programs to manipulate all the sounds or images in a directory (DEMO)
- ❑ We use loops to walk the sequence in a directory
- ❑ Writing Utility Functions: Moving/manipulating your files
- ❑ We can use the other ideas we've introduced, like conditionals
- ❑ For example, process only the sounds whose names end in ".wav" (DEMO)
- ❑ We'll have to do a little string manipulation here
- ❑ A string is a linear encoding of characters
- ❑ Or, process only the sounds whose modification date is today (DEMO)
- ❑ Dates are another encoding on numbers
- ❑ We can also copy files from one place to another
- ❑ But how we do it depends on the encoding in the file
- ❑ You can always read and write the numbers. This is called a binary copy (for reasons we'll see later)
- ❑ We could also read and write the letters, the strings. This is assuming the data is text
- ❑ We can use a loop to get all data across
- ❑ KEY CS IDEA: But a loop that doesn't count. Instead, it tests if we're at the end of the file. A WHILE
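The file ideas above can be sketched without touching a real disk by using in-memory files from Python's standard `io` module. The file names, the chunk size, and the case-insensitive match are assumptions made for this illustration.

```python
import io

def is_wav(filename):
    """A little string manipulation: does the name end in ".wav"?
    (Ignoring upper/lower case is an added assumption.)"""
    return filename.lower().endswith(".wav")

def binary_copy(source, destination, chunk_size=4):
    """Copy the raw numbers (bytes) with a WHILE loop that tests for end of file."""
    chunk = source.read(chunk_size)
    while chunk:                       # an empty chunk means end of file
        destination.write(chunk)
        chunk = source.read(chunk_size)

# Conditional processing of a directory listing (hypothetical names).
names = ["hello.wav", "notes.txt", "LOUD.WAV"]
wav_names = [name for name in names if is_wav(name)]

# In-memory stand-ins for files opened in binary mode.
source = io.BytesIO(bytes([65, 66, 67, 0, 255, 10, 13]))
destination = io.BytesIO()
binary_copy(source, destination)
print(wav_names, list(destination.getvalue()))
```

With real files, the same `binary_copy` would work on objects returned by `open(path, "rb")` and `open(path, "wb")`.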
- ❑ Video: A series of pictures/frames
- ❑ A Video is a series of images called frames
- ❑ We can process them JUST like we did images, but we have to do it for every frame
- ❑ Easiest way: Convert your video to frames, manipulate the frames, reassemble them into a movie
- ❑ Demonstration with MediaTools (DEMO)
- ❑ Now, apply some processing to those frames, by combining what we know about processing files with what we know about manipulating images (DEMO)
- ❑ Filtering a range of pictures
- ❑ What if we want to do something to just SOME of the frames? Same techniques as processing just SOME of the pixels
- ❑ Insert a blue balloon in SOME of the frames of a video (DEMO)
- ❑ We can also do masking
- ❑ How the weatherman works! Background subtraction and "bluescreening" (DEMO)
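As a hedged sketch of the two video ideas above, the code below treats a video as a list of frames and each frame as a table of (red, green, blue) pixels. The blueness test and its cutoff values are assumptions chosen for illustration; real bluescreening needs more careful color matching.

```python
def is_blue(pixel):
    """A rough, assumed test for 'this pixel is part of the blue screen'."""
    red, green, blue = pixel
    return blue > 200 and red < 100 and green < 100

def bluescreen(frame, background):
    """Wherever the frame is blue, show the background pixel instead."""
    result = []
    for frame_row, back_row in zip(frame, background):
        result.append([b if is_blue(f) else f
                       for f, b in zip(frame_row, back_row)])
    return result

def process_some_frames(frames, first, last, frame_function):
    """Apply frame_function only to frames numbered first..last (inclusive)."""
    return [frame_function(frame) if first <= i <= last else frame
            for i, frame in enumerate(frames)]

BLUE = (0, 0, 255)
PERSON = (200, 150, 100)
weather_map = [[(1, 2, 3), (4, 5, 6)]]   # a tiny 1x2 background picture
frame = [[BLUE, PERSON]]                 # a tiny 1x2 frame
video = [frame, frame, frame]

result = process_some_frames(video, 1, 2,
                             lambda f: bluescreen(f, weather_map))
print(result)
```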
- ❑ "WHY IS THIS TAKING SO LONG?!?"
- ❑ Why is video processing taking so long? Is it just the speed of my processor?
- ❑ KEY CS IDEA: The "order" (Big Oh) of an algorithm
- ❑ The more loops you have, the more basic operations you do
- ❑ Array processing is order n, O(n)
- ❑ Table/matrix processing is O(n*m)
- ❑ If you have f frames in a video, that's O(n*m*f)
- ❑ What if you compute a mask for each frame THEN process it? That's about 2*n*m*f basic operations: twice the work, though still O(n*m*f)
- ❑ KEY CS IDEA: Moore's Law: Every 18 months, the processor speed doubles for the same cost
- ❑ But that only cuts the time cost in HALF
- ❑ That's great, but it doesn't make processing a movie like processing an array
- ❑ How do we make things faster? Fixing how we specify the recipe/process
- ❑ An example: Finding things in an array
- ❑ Just searching one-after-the-other: O(n)
- ❑ What if they're in a particular order? We can use a binary search O(log n)
- ❑ Example: How to search a dictionary or phone book efficiently (DEMO)
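The difference between the two searches can be seen by counting steps. This is a minimal sketch with hypothetical function names; both functions return the position found and the number of items they had to look at.

```python
def linear_search(items, target):
    """Look at items one after the other: O(n)."""
    steps = 0
    for index, item in enumerate(items):
        steps += 1
        if item == target:
            return index, steps
    return -1, steps

def binary_search(items, target):
    """Items must be in order. Halve the search range each time: O(log n)."""
    low, high, steps = 0, len(items) - 1, 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if items[middle] == target:
            return middle, steps
        if items[middle] < target:
            low = middle + 1      # target can only be in the upper half
        else:
            high = middle - 1     # target can only be in the lower half
    return -1, steps

numbers = list(range(0, 2000, 2))   # 1000 even numbers, already in order
print(linear_search(numbers, 1998))
print(binary_search(numbers, 1998))
```

This is the phone book trick: open in the middle, decide which half the name is in, and repeat.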
- ❑ How do we get things in order? Sorting.
- ❑ Worst we can do O(n*n)
- ❑ Compare everything to everything
- ❑ Best we can do is probably O(n log n) (Not planning to give more than an intuitive sense here)
- ❑ Guess what: Some things CAN'T be made faster!
- ❑ Imagine an optimal arrangement of sounds in a composition/synthesis. You have to check EVERY combination.
- ❑ Let's say that you have 60 sounds you want to arrange and any order is possible, but you want to figure out the best one
- ❑ Basically, if you have to try every combination of n things, there are 2 ^ n combinations (can demo this pretty easily)
- ❑ 2 ^ 60 = 1,152,921,504,606,846,976
- ❑ Imagine that you have a 1 GHz computer (1 billion basic operations per second) -- a top of the line processor today
- ❑ It'll take you 1,152,921,504.606847 seconds to optimize that data
- ❑ That's 19,215,358.41 minutes
- ❑ That's about 320,256 hours, or about 13,344 days
- ❑ That's about 36.5 years
- ❑ With Moore's law, in about 18 months, you can do that in only about 18 years!
- ❑ And 60 sounds is a SMALL amount -- most songs have many more notes than that
- ❑ Can we do better? Maybe -- can you be satisfied with less than perfect? Can we be smarter than checking EVERY combination? THAT'S PART OF WHAT COMPUTER SCIENTISTS DO!
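The claim that n things have 2 ^ n combinations can be demonstrated for a small n and then extrapolated. This sketch uses `itertools.combinations` from the Python standard library; the three "sounds" are hypothetical placeholders.

```python
from itertools import combinations

def all_combinations(items):
    """Every subset of items, from the empty one up to all of them."""
    result = []
    for size in range(len(items) + 1):
        result.extend(combinations(items, size))
    return result

sounds = ["a", "b", "c"]
subsets = all_combinations(sounds)
print(len(subsets), subsets)

# Each extra sound doubles the number of combinations to check.
counts = {n: 2 ** n for n in (3, 10, 20, 60)}
for n, count in counts.items():
    print(n, "sounds:", count, "combinations")
```

Listing every combination is fine for 3 sounds and hopeless for 60, which is the point of the arithmetic above.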
- ❑ Text as a Media Type
- ❑ KEY CS IDEA: Text itself is a media type that computers are good at manipulating
- ❑ We've already been doing some string manipulation.
- ❑ Imagine that you've got a file with RGB values specified as TEXT. Can we turn that into a picture? DEMO
- ❑ Can we go from a picture and generate those RGB values? DEMO
- ❑ KEY CS IDEA: This is interpretation -- we're moving between encodings, and using an encoding to tell the computer what to do.
- ❑ If text is language, we can process it, still, but it's less well-formed so harder to process
- ❑ A demo or two goes here
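Moving between text and pixels might look like the sketch below. The text layout (one pixel per line, three numbers separated by spaces) is an assumed format for illustration, not one specified in the outline.

```python
def text_to_pixels(text):
    """Interpret each line of text as one (red, green, blue) pixel."""
    pixels = []
    for line in text.strip().splitlines():
        red, green, blue = [int(part) for part in line.split()]
        pixels.append((red, green, blue))
    return pixels

def pixels_to_text(pixels):
    """Go the other way: generate the text encoding from the pixels."""
    return "\n".join("%d %d %d" % pixel for pixel in pixels)

rgb_text = "255 0 0\n0 255 0\n0 0 255"
pixels = text_to_pixels(rgb_text)
round_trip = pixels_to_text(pixels)
print(pixels)
print(round_trip)
```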
- ❑ Making other programs do the work
- ❑ Writing fast, smart algorithms is a lot of work.
- ❑ Let other people do it!
- ❑ We can use what they produce!
- ❑ CS CONCEPT: We can write programs to control other programs
- ❑ Sometimes called scripting
- ❑ Not all programs are scriptable! Photoshop is not. GIMP, a free, open-source, cross-platform alternative to Photoshop, is
- ❑ Multiple ways of scripting
- ❑ Sometimes, you can just call functions within your own language, and inter-application communication handles it
- ❑ Sometimes, you write a program that creates files that the other program understands
- ❑ Command files
- ❑ Data files
- ❑ Demonstrate both techniques for controlling GIMP
- ❑ Graphing data
- ❑ What data/information visualizers do: Take some data, turn it into a picture
- ❑ We can do that ourselves, using what we know about graphics, reading files, and interpreting text
- ❑ Building a simple scatterplot, say with stock quotes or other data from the Web DEMO
- ❑ But building graphs is hard -- let's make somebody else do it!
- ❑ We can get data ready for Excel
- ❑ Demo creating a tab-delimited file of data
- ❑ Or we can use a sophisticated plotting package
- ❑ Graphing data with GNUPlot
- ❑ Use what we know about using other programs and make GNUPlot do it!
- ❑ Get data ready for GNUPlot
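Getting data ready for another program can be as simple as writing it in a layout that program reads. The sketch below writes tab-delimited text with Python's standard `csv` module into an in-memory file; the stock quotes are made-up illustration data, and a plotting package may expect a slightly different layout.

```python
import csv
import io

def write_tab_delimited(rows, out):
    """Write each row as one line, with tabs between the values."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow(row)

# Hypothetical data: a header row followed by (day, price) pairs.
quotes = [("day", "price"), (1, 10.5), (2, 11.0), (3, 9.75)]

buffer = io.StringIO()       # stands in for a file opened for writing
write_tab_delimited(quotes, buffer)
print(buffer.getvalue())
```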
- ❑ "Can't we do this any easier?": Functional Decomposition
- ❑ Writing lines and lines and lines of code is tedious and error prone. How do we make it easier?
- ❑ Wherever possible, create a function to do it!
- ❑ Remember how we created a function with our filter to apply to our images?
- ❑ We can use this idea more generally
- ❑ "Can't we do this any easier?": Recursion
- ❑ Some problems are more easily solved by doing something other than straight sequential, iterative, and conditional computation
- ❑ KEY CS IDEA: Recursion is functional programming to the max! A function calling itself!
- ❑ Use all the Brian Harvey ways of thinking about recursion here
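One small illustration of a function calling itself, using a hypothetical `reverse_samples` function: reversing a list means reversing everything after the first item, then putting the first item at the end.

```python
def reverse_samples(samples):
    """Reverse a list recursively."""
    if len(samples) <= 1:              # base case: nothing left to reverse
        return list(samples)
    # Recursive case: reverse the rest, then append the first item.
    return reverse_samples(samples[1:]) + [samples[0]]

backwards = reverse_samples([1, 2, 3, 4])
print(backwards)
```

Applied to the samples of a sound, this would play the sound backwards.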
- ❑ "Can't we do this any easier?": Functional programming
- ❑ We can take functions further
- ❑ Remember when we said in Week 1 that ANYTHING can have a name? So can functions!
- ❑ Remember how functions took inputs? Functions can also be inputs to other functions!
- ❑ We can apply functions to data, rather than pass data to functions
- ❑ Reduce, Map, Apply, Filter/Select
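A brief sketch of these ideas using Python's built-in `map` and `filter` and the standard `functools.reduce`. The sample values and helper functions are hypothetical.

```python
from functools import reduce

def double(x):
    return x * 2

def is_positive(x):
    return x > 0

def louder_of(a, b):
    """Keep whichever sample is further from zero."""
    return a if abs(a) >= abs(b) else b

samples = [3, -8, 5, -1]

make_louder = double                          # a function can have another name
doubled = list(map(make_louder, samples))     # Map: apply to every element
positives = list(filter(is_positive, samples))  # Filter/Select: keep some elements
peak = reduce(louder_of, samples)             # Reduce: combine into one value

def apply_to_all(function, data):
    """A function that takes another function as an input."""
    return [function(item) for item in data]

print(doubled, positives, peak, apply_to_all(abs, samples))
```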
- ❑ "Can't we do this any easier?": Objects
- ❑ Pixels and files, and pictures and directories, are more than just sequences of numbers
- ❑ How do we represent these kinds of related encodings?
- ❑ Functions that manipulate these complex encodings probably differ depending on the data we're manipulating
- ❑ Sometimes we may want to re-use the same names
- ❑ Sometimes we don't even want to use functions -- instead, we want to touch the numbers (like red value) directly!
- ❑ KEY CS IDEA: An Object encapsulates (combines) data and behavior into one encoding that we can manipulate more easily
- ❑ Pixels as objects
- ❑ Files as objects
- ❑ How we make objects (DEMO)
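A pixel as an object might be sketched like this. The `Pixel` class and its method names are hypothetical and are not the course's media library; the point is that the data (red, green, blue) and the behavior (brighten) travel together.

```python
class Pixel:
    """Combines a pixel's data and its behavior into one thing."""

    def __init__(self, red, green, blue):
        self.red = red
        self.green = green
        self.blue = blue

    def brighten(self, amount):
        """Behavior: raise every color value, staying within 0..255."""
        self.red = max(0, min(255, self.red + amount))
        self.green = max(0, min(255, self.green + amount))
        self.blue = max(0, min(255, self.blue + amount))

    def as_tuple(self):
        return (self.red, self.green, self.blue)

p = Pixel(250, 100, 0)
p.brighten(10)
p.red = 128            # touching a number (the red value) directly
print(p.as_tuple())
```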
- ❑ Re-visiting media manipulation as functional and object-oriented programming
- ❑ Re-do some of the earlier examples, but in terms of functions
- ❑ Re-do some of the earlier examples, but in terms of objects
- ❑ Languages and Representations for Recipes: It's much of what computer scientists do
- ❑ Jython isn't the only language that computer scientists use.
- ❑ Other languages allow us to represent data or process (recipes) better for some kinds of problems
- ❑ Examples: No more than 2-3 screens each, just to give a flavor, the "sound" of the language
- ❑ Lists and functions and functional programming in Lisp
- ❑ Manipulating pixels and sounds in Smalltalk
- ❑ Manipulating arrays and matrices in MATLAB
- ❑ Much of the world uses Java these days: Let's talk about what Java looks like
- ❑ Introduction to Java
- ❑ Using Java: Code goes in files, we "compile" the files, we execute the "object" code
- ❑ All of Java is objects
- ❑ How classes are defined in Java
- ❑ All data in Java is "typed"
- ❑ Tell it the encoding!
- ❑ Walking a Java file -- what it looks like
- ❑ Introduction to Java Media Manipulation
- ❑ We can call functions (called methods) in Java to do all that we did before
- ❑ Picture and Image manipulation in Java EXAMPLES
- ❑ Spare slot and Final Exam Review