# Fall 2006

## 1. Data Representation

• How does a computer store data such as text, numbers, sound, video?
• Before computers, data was primarily stored in Analog form
• Analogous to real life, signals can vary across an infinite range of values.
• Example - Wrist watch with sweep hand is an analog device, continuous read-out of time
• Computers store data in Digital form - as a series of discrete values.
• Example - Digital watch
• Using analog to digital conversion a continuous signal (sound wave, light source) can be represented by a series of numbers, each one representing a brief sample of the signal. The numbers can then be stored as binary data.

• An approximation to the original signal can then be created from the stored data.

Here is a diagram illustrating the process of digital to analog conversion
• Analog Storage
• Advantage: Continuous Sampling: An analog recording captures a continuous stream of information, so we get the most information possible about the signal
• Disadvantage: Reproduction Errors: Each time we reproduce or duplicate the signal, we lose precision (a tape of a tape of a tape is very poor in quality)
• Digital Storage
• Advantage: Lossless Reproduction: No matter how often we reproduce the stored data, the copy is always the same
• Disadvantage: Sampling Errors: If we sample at too low a rate, we will lose important signal information
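
The effect of the sampling rate can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the 5 Hz sine wave, the 8-bit quantization, and the function names (`sample_signal`, `quantize`, `reconstruct`, `worst_error`) are hypothetical choices, not part of the course material. The signal is sampled, each sample is stored as an integer code, and an approximation is rebuilt by holding the most recent sample.

```python
import math

def sample_signal(signal, duration, rate):
    """Measure `signal` (a function of time in seconds) `rate` times per second."""
    count = int(duration * rate)
    return [signal(i / rate) for i in range(count)]

def quantize(value, levels=256):
    """Map a value in [-1.0, 1.0] to an integer code from 0 to levels - 1."""
    code = round((value + 1.0) / 2.0 * (levels - 1))
    return max(0, min(levels - 1, code))

def reconstruct(codes, rate, t, levels=256):
    """Approximate the signal at time t by holding the most recent sample."""
    index = min(int(t * rate), len(codes) - 1)
    return codes[index] / (levels - 1) * 2.0 - 1.0

def tone(t):
    """Hypothetical analog source: a 5 Hz sine wave."""
    return math.sin(2 * math.pi * 5 * t)

def worst_error(rate, duration=1.0, checks=1000):
    """Largest gap between the original and the reconstruction at `checks` points."""
    codes = [quantize(v) for v in sample_signal(tone, duration, rate)]
    times = [k * duration / checks for k in range(checks)]
    return max(abs(tone(t) - reconstruct(codes, rate, t)) for t in times)

low = worst_error(rate=8)      # too few samples for a 5 Hz signal
high = worst_error(rate=1000)  # many samples per cycle
print(f"worst error at 8 samples/sec:    {low:.3f}")
print(f"worst error at 1000 samples/sec: {high:.3f}")
```

At the low rate the reconstruction misses whole peaks of the wave, which is the sampling error described above; at the high rate only a small rounding error remains.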

## 2. Binary Numbers

• All data stored in a computer is stored in binary format -- a series of 1's and 0's
• Recall: The decimal system uses the digits 0-9 to represent a number, and uses a "place holder" system
• For example, the number 145 stands for the number whose value is 1*100 + 4*10 +5*1, that is, if we label columns of the number, we get
`          H | T | O          1 | 4 | 5`
where "H" is the hundreds column (or place), "T" is the tens column, and "O" is the ones column.
• Alternatively, we see that the ones column can be interpreted as 10^0, the tens column as 10^1, and the hundreds column as 10^2, such that the number can be written as
`        10^2|10^1|10^0          1 |  4 |  5 `
so 145 is really {(1*10^2)+(4*10^1)+(5*10^0)}.
• The binary system works under the same principles as the decimal system, only it operates in powers of 2 rather than powers of 10. Instead of the columns
`10^2|10^1|10^0`
for a number, we have
`2^2|2^1|2^0`
• Instead of using the digits 0-9, we use only the digits 0 and 1
• For example, the binary number 1011 is actually the decimal number 11, since the number represents  {(1*2^3) + (0*2^2) + (1*2^1)+(1*2^0)} = 8 + 2 + 1 = 11.
• What decimal number is represented by  101101 ?
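
The place-value rule above can be written as a short Python sketch. The function name `binary_to_decimal` is a hypothetical choice for illustration; Python's built-in `int(bits, 2)` performs the same conversion.

```python
def binary_to_decimal(bits):
    """Sum digit * 2^position, with position 0 at the right-most digit."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit not in "01":
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("1011"))  # 8 + 0 + 2 + 1 = 11
```

The same function can be used to check an answer to the exercise above.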

## 3. Storing Integers

• Different integers require different numbers of binary digits
• 13 = 1101
• 116 = 1110100
• Modern computers use fixed-width storage for integers, typically 32 bits (leading 0's are put in front of the binary number to fill out all of the 32 bits)
• 13 =  00000000000000000000000000001101
• 116 = 00000000000000000000000001110100
• Thus, 2^32 = 4,294,967,296 bit patterns are possible. To store positive and negative numbers, we reserve about half of the patterns for positive numbers and about half for negatives.
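
As a minimal sketch of fixed-width storage, the following Python pads a binary number with leading 0's to 32 bits. The helper name `to_fixed_width` and the constant `WIDTH` are hypothetical, chosen for illustration only.

```python
WIDTH = 32  # number of bits assumed for one stored integer

def to_fixed_width(n, width=WIDTH):
    """Return the binary digits of a non-negative integer, padded with leading 0's."""
    if n < 0 or n >= 2 ** width:
        raise ValueError(f"{n} does not fit in {width} unsigned bits")
    return format(n, "b").zfill(width)

print(format(116, "b"))     # 1110100
print(to_fixed_width(116))  # 1110100 preceded by 25 leading 0's
print(2 ** WIDTH)           # 4294967296 possible bit patterns
```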

• Negative numbers are signified by the left-most bit being 1: "0" indicates that the number is positive (or zero), "1" indicates negative.  (see Reed, page 218) This is called two's-complement notation. Unlike a simple +/- sign, the left-most bit is part of the number: it carries the value -2^31, and the values of the remaining 31 bits are added to it as usual.
• Negative number with largest absolute value (-2^31) corresponds to the smallest negative bit pattern
•           10000000000000000000000000000000
• Negative number with smallest absolute value (-1) corresponds to the largest negative bit pattern
•           11111111111111111111111111111111
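
The two bit patterns above can be reproduced with a short Python sketch of two's-complement encoding, in which a negative number n is stored as the unsigned pattern for n + 2^32. The function names `to_twos_complement` and `from_twos_complement` are hypothetical, used here only for illustration.

```python
def to_twos_complement(n, width=32):
    """Return the width-bit two's-complement pattern for the integer n."""
    if not -(2 ** (width - 1)) <= n < 2 ** (width - 1):
        raise ValueError(f"{n} does not fit in {width} bits")
    # For negative n, n % 2**width equals n + 2**width.
    return format(n % (2 ** width), "b").zfill(width)

def from_twos_complement(bits):
    """Decode a two's-complement bit string back to an integer."""
    value = int(bits, 2)
    if bits[0] == "1":           # left-most bit set: the number is negative
        value -= 2 ** len(bits)
    return value

print(to_twos_complement(-(2 ** 31)))  # 1 followed by 31 zeros
print(to_twos_complement(-1))          # 32 ones
print(from_twos_complement(to_twos_complement(-116)))  # -116
```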

## 3. Floating Point Numbers

• The real number `1234.56` can be represented as `1.23456e3` (1.23456 × 10^3) in scientific notation.
• In scientific notation, the numbers are normalized so only one digit appears to the left of the decimal point.
• `-990.0` is represented as `-9.9e2` (-9.9 × 10^2)
• We can think of this as the pair of integers `(-99,2)`: the digits -99, with the decimal point understood to follow the first digit, and the exponent 2
• Floating-point notation
• Sign-bit, Exponent, Mantissa (fractional part)
• IEEE single-precision floating-point representation (32 bits)
• Sign bit
• Exponent (eight bits)
• Mantissa (remaining 23 bits)
• Range of numbers: `-3.40 × 10^38` to `3.40 × 10^38` (approximately)
• Can get roughly 7 significant digits of precision
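
The three fields can be inspected with Python's standard `struct` module, as in the sketch below. The function name `float32_fields` is hypothetical. One detail not covered in the notes above: in IEEE 754 single precision the exponent is stored with a bias of 127, and the mantissa holds the binary fraction after an implied leading 1.

```python
import struct

def float32_fields(x):
    """Split the 32-bit single-precision pattern of x into its three fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]  # sign, exponent (8 bits), mantissa (23 bits)

sign, exponent, mantissa = float32_fields(-990.0)
print(sign)      # 1, the number is negative
print(exponent)  # 10001000, which is 136 = 127 + 9 (990 = 1.111011110 x 2^9 in binary)
print(mantissa)  # 111011110 followed by 14 zeros

# Storing a value with more than about 7 significant digits loses the extra digits.
(stored,) = struct.unpack(">f", struct.pack(">f", 1234.56789))
print(stored)
```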

## 4. Strings

• Strings are stored as sequences of characters.
• A character (like "a", "#", etc) is stored as a binary number in a standard code, called the ASCII (American Standard Code for Information Interchange) code.
• Overhead of Reed, page 221
Data Representation Page
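
As a small illustration of the character encoding described above, the Python sketch below looks up the ASCII code of each character and shows it as a binary number. The function name `string_to_bits` is hypothetical, and showing each code as 8 bits (one byte per character) is an assumption about the storage layout; ASCII itself defines 7-bit codes.

```python
def string_to_bits(text):
    """Return the ASCII code of each character as an 8-bit binary string."""
    return [format(code, "08b") for code in text.encode("ascii")]

for character, bits in zip("a#", string_to_bits("a#")):
    print(character, ord(character), bits)
# a 97 01100001
# # 35 00100011
```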

## 5. Images

• Bitmaps
• A bitmap is a collection of dots, each of which is called a pixel (short for picture element).
• A black-and-white pixel can be represented as a single bit.
• A colored pixel is represented by a 24-bit code of three bytes giving the intensity of red, green, and blue (RGB value).
• For example, the HTML color `brown` is represented by (165,42,42).
• Image formats are usually compressed bitmaps
• Common Web image formats
• Graphics Interchange Format (GIF) - a lossless format
• Joint Photographic Experts Group (JPEG) - a lossy format
• Moving Picture Experts Group (MPEG) for motion video
• Here is an example of a GIF image - each color is a 24-bit RGB value (the GIF file itself stores each pixel as an index into a table of up to 256 such colors)
• Here is an example of a simpler gray-scale image - 8 bits for each pixel
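
The 24-bit RGB code for a colored pixel can be sketched in Python as follows. The function names `pack_rgb` and `unpack_rgb` are hypothetical, and placing red in the highest byte is an assumption chosen to match the order of HTML hex color codes.

```python
def pack_rgb(red, green, blue):
    """Combine three intensities (0-255 each) into one 24-bit pixel code."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError(f"intensity out of range: {value}")
    return (red << 16) | (green << 8) | blue

def unpack_rgb(code):
    """Split a 24-bit pixel code back into (red, green, blue)."""
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF

brown = pack_rgb(165, 42, 42)
print(format(brown, "024b"))  # 101001010010101000101010
print(f"#{brown:06X}")        # #A52A2A
print(unpack_rgb(brown))      # (165, 42, 42)
```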

## 6. Sounds

• A sound wave can be represented as a waveform whose amplitude varies with time.
• With digital sampling, the waveform can be represented by a sequence of discrete values, measured at regular intervals and then stored as a sequence of numbers.
• The frequency of measurement (the sampling rate) determines the fidelity of the reproduction.
• 8,000 samples/sec for long-distance telephony.
• 44,100 samples/sec for CD-quality sound.
• Some widely used encoding schemes:
• Musical Instrument Digital Interface (MIDI) in music synthesizers encodes what instrument is to play which note for what length of time
• MP3 (MPEG-1 Audio Layer 3) is an audio compression format defined as part of MPEG-1
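
To give a feel for what the sampling rates above mean for storage, the Python sketch below estimates the size of one minute of uncompressed sound. The sample sizes and channel counts are assumptions not stated in these notes (8-bit mono for telephony, 16-bit stereo for CD audio), and the function name `bytes_needed` is hypothetical.

```python
def bytes_needed(seconds, samples_per_sec, bits_per_sample, channels):
    """Storage in bytes for uncompressed sampled sound."""
    return seconds * samples_per_sec * bits_per_sample * channels // 8

# Assumed formats: 8-bit mono for telephony, 16-bit stereo for CD audio.
telephone = bytes_needed(60, 8_000, bits_per_sample=8, channels=1)
cd = bytes_needed(60, 44_100, bits_per_sample=16, channels=2)

print(telephone)  # 480000 bytes for one minute
print(cd)         # 10584000 bytes for one minute
```

Under these assumptions a minute of CD-quality sound takes roughly 10 MB, which is why compressed formats such as MP3 are widely used.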