Python reading notes - Chapter 0

The Practice of Computing using PYTHON


-1st course in computer science about new way of solving problems computationally.
-intro programming language should be:
1. should be relatively simple to learn.
-intro course mostly about problem solving, so programming lang shouldn't get in way.
-powerful built-in data structures
-advanced control constructs
2. should be practical.
-students see that might use what are learning while learning it.
-supports learning fundamental prgramming issues:
-fundamental object-oriented approach.
-common data structures.
-supports learning complex computing issues like threads and regular expressions.
-used to address practical probs like:
-Web access
-database manipulation

-focus on data manipulation and analysis as a theme.
1. teach prob solving w/in context of CS1 using Py as vehicle.
2. provide <ex>'s of developing programs focusing on kinds of data analysis probs students will face.
3. Give practical foundation in programming.
-produce useful, meaningful results in their respective fields of study.

-motivate need for writing classes.
-functions split into 2 parts cuz how Py handles mutable objects like lists.

-Data Manipulation - theme.
-Prob Solving & Case Studies - emph prob solving.
-divide-and-conquer approach.
-Code Examples-
-Interactive Sessions -
-Self-Tests -
-Programming Tips - PTp's


0.1 - Why Comp Science.
1 -computers everywhere - millions upon millions.
-but, not enough of reason.
2 -more universally applicable than any other commodity in hx.
-KEY ELEMENT: - no matter what area of interest, computer good tool.
-universal utility is unique.

-computer 'science'-
-popular def of 'computer science' = computer programming.
-is way people intro'd to computers.
-computing/comp science WAY more than progrmming.
-<EX>'s of areas of computer science:
1. theory of computation -
-(from time when not millions of computers).
-scientiest thought about waht means to do computer and what limits might be.
-'is there a problem we can conceive of but can't compute?'
-"Halting Problem" - can't be solved by program on any computer. (
-knowing what can/can't solve on computer is import.
-is subject of study by comp scientiests.
-focuses on theory of computation.

2. computational efficiency -
-prob is computable, but doesn't mean is easily computed.
-knowing how diff prob is to solve also v important.
-determining meaningful measure of difficulty is an issue.
-<ex> - sorting items - 100,000
-Bubble Sort - v slow algorithm for sorting.
-would take 800 seconds (13 minutes)
-Quick Sort - faster sorting algorithm.
-would take 0.3 seconds - 2400 x's faster.

3. algorithms and data structures -
-the "currency" of the comp scientist.

-algorithms - methods used to solve probs.
-data structures - the organizations of data that algorithms use.
-these 2 concepts distinct.
-algorithm/prob solving - general approach to solving a prob.
-data structure - organization of data being processed.
-can examine ea independ of how are programmed.
-design algor's and data structres and then implement them in program.
-**understandng how to design algorithms/data structures independent of programmng lang CRITICAL for writing correct/efficient code.

3. parallel processing - using multiple computers to solve problems.
-issue for every1 today.
-cuz most computers have at least 2 CPUs and lots have 4+.
-PlayStation3 - IBM chip w/ 8 processors.
-Intel - new prototype chip w/ 80 processors.
-need new algorithms, data structures and programming paradigms to take advantage of multiple processors.

4. software engineering - process of writing programs.
-developed own sub-discipline.
-def. - concerns process of creating programs.
-designing algor used
-supporting testing/maintenance of created programs.
-off-shoot discipline that represents developed program as mathematical entity to PROVE program.
-0.2 - Difficulty/Promise of progrmming
-why doesn't every1 do it??
1. two things at once -
-<ex> - (analogy to French poetry).
-don't know French.
-don't know poetry.
-so TWO probs:
-syntax/semantics - form and substance of language.
-rules - gotta know rules of poetry -
-what makes a good poem.
-applied to programming:
-gotta learn semantics/syntax of progrmming lang.
-don't know how solve probs using computer - just like don't know how write poetry.
-gotta solve two probs @ once.
-get familiar w/ syntax/semantics.
-learn solve probs w/ computer.
-SO, the two probs are on two diff levels:
1) meaning of programming words/lang.
2) learn intent of program - what prob solving.

2. what's a good program? - guidlines for writing good programs:
-problem solving - program is detail of how you think partic prob or class of probs should be solved.
-gotta think thru prob before writing.

| Rule 1: Think before you program.
| Rule 2: Program is human-readable essay on prob solving.
-also happens to execute on computer.

-program is object to be read by another person.
-I read it - put down and come back after while.
-will write progrms as group effort - others gotta read what is there.

-progrm is essay - runs on computer - this is most unique/impressive aspects of it.
-think of way to solve problem.
-then write it down.
-prob can be solved many times cuz program used independent of me.
-computer progrm biggest leap forward since Gutenberg.

0.3 Computer Language
-program - concrete, runnable realization of person's thoughts.
-so have diff langs to better reflect thoughts.

-PYTHON - 3 import features for intro progrmming lang:
1. provides low cognitive load on student.
-easy to express prob-solving thoughts in mechanisms supplied by lang.
2. easy to apply to probs will encounter after the course.
3. broad support across many disciplines.
-should be used by lotsa people in many fields.
-useful packages/collections of support programs available to many diff types users.

-there should be one - and preferably ONLY one - obvious way to do it.
-one-to-one mapping between prob-solving need and language support for the need.
-few shortcuts
-less language to remember.

-Py as 'best practices' lang -
-Py provides many of best parts of other langs directly to user.
-important data structures provided as part of standard lang.
-iteration introduced early.
-is available on standard data structures.
-packages for files, file paths, the Web are part of standard lang.
-"batteries included" lang.
-many commonly needed aspects provided by default.

-Py is open source -
-support from various communities.
-breadth of support.
-open source - Py developed under Open Source model.
-is about software.
-is about culture/viewpoint used by developers of it.
-makes software freely available.
-guarantee free availability to those who develop new software based on the free software.
-<ex>'s - -Linux - o.s.
-Firefox - web browser
-Thunderbird - mail client
-Apache - web server
-as culture - Open Source people wanna make software available and useful as possible.
-share fruits of labor
-as group, move software forward and app areas where software used.
-"A rising tide lifts all boats."

-Py packages for:
-natural language

-no 'best' language.
-all comp progrmming langs compromise some.
-can't use own words.
-natural langs too hard to turn into precise directions.
-all langs have strengths/weaknesses.

-0.4 Computation
-computation - manipulation of data by either humans or machines.
-data may be numbers, characters or other symbols.

-0.5 What's a computer
-computer - something that does computation.
-vague re: how does computation.
-cuz computers are diverse.
-features computation systems must have:
1) has to accept input.
-diff kinds input.
-data has to enter computer for processing.
2) defined by 'does computation,' so must be able to manipulate data.
3) has to output data.

-performs computation w/ assembly of simple parts.
-huge no. simple parts - assembled in complex ways.
-organiz of parts is what makes computers complex.

-Computation in Nature - (ebook pg 13)
1. human brain - 50's-60's - computers = 'electronic brains.'
-only computing object was human brain.
-powerful engine.
-neuron - acts like small switch - signal to dendrite, passes down axon to axon terminal, synapse.
-neuron signal not transmitted by electricity.
-chemical Δs.
-neuron has to recover for millisecond.
-built-in time delay.
-fired in response to combo of the no. of signals received.
-and in response to strength of signal.
-neuron is slow switch.
-made up for by complexity and size of brain.
-100 billion (1011) neurons - 100 trillion (1014) synapses.
-branch of computer sci deals w/ neural networks.

2. evolutionary computation - evolution of bio species is computation process.
-inputs are environmental variables.
-computational process is adaptations of genetic code.
-output - adaptation results from Δ'd genetic code.
-basic parts of genetic algor:
1) way to encode solution in linear sequence.
-like info in chromosomes.
-consists of parameters required to constitute solution.
2) evaluation functionB - way to evaluate how good solution is.
-shown by f(x) in diagram.
3) population of solutions - created by random selection of solution components.
-use to create new solutions.
4) modification methods - modify 1 aspect of existing solution.
-crossover - combines aspects of 2 parent solutions.
-(see pg 16 of ebook).

-Human Computer -
-"computer" - 17th century to WWII - meant 'people.'
-used people to compute difficult, high labor values for: -mathematical constants like pi.
-fission reactions
-gun trajectory tables
-ΔΔ - Wm Shanks does pi to 707 decimal places.
-took 28-years.
-mistake @ 528th place, so last 2-years were wasted.
-ΔΔ - gun trajectories using ENIAC - 1st general-purpose electronic digital computer.
-took 30-40 hours to do one trajectory.


0-0.6 - Modern, Electronic computer
-SWITCH - base component.
-early ones used mechanical switches/relays
-then vacuum tubes.
-either on or off - electricity flows or doesn't.

-can construct simple logic circuits - True or False.
-and circuit = two switches in row - both have to be on (True) for current.
-or circuit = two parallel switches - either on on (True) or both on (True) for current.
-logic gates - Boolean logic elements.
-constructed from simple circuits.
-can assemble entire computer.
-adder - made from and and or gate.
-makes possible to add.
-can use to create subtraction circuit.
- " " " " multiplication circuit
-then, division.

-Transistor - electronic devise
-Shockley, Bardeen and Brattain
-Bell Labs
-Nobel Prize in Physics in 1956.
-used new technology - material called semiconductor.
-many uses - also was new switch.
-three wires:
-1 Source
-2 Sink
-3 Gate
-current flows from Source to Sink.
-signal - voltage or current on Gate.
-if signal, then electricity flows and switch is ON.
-if no signal on Gate, then no electricity and is OFF.

-transistor evolved over 60-years since created:
1. Smaller size - 1st transistor big size - inches in size.
-TI sells 1st commercial trans. - shrinks to size of stamp.
-still size was limitation - individ transistor components needed to be smaller cuz needed more transistors.
-integrated circuits - Kilby of TI.
-'58 - '59
-2000 Nobel Prize in Physics.
-contiguous piece semiconductor material.
-multiple transistors on this.
-allows many transistors embedded on single piece of material.
-this allows more complex functional circuit in very small area.
-Intel 4004 -'71 - Intel manuf first compolete CPU on single chip.
-size of fingernail.
-transistor size of 10 microns = single fiber cotton/silk.
-shrinking continues.
-IBM gate is 50 nanometers = 50x10-9 meters = 10 x's smaller than single wavelenth of visible light.
-thickness of cell membrane.
-10 x's of single turn of DNA helix.

2. Quantity and function -
-cuz smaller, more transistors on chip - see above.
-Gordon Moore - predicts trend.
-1965 - predicts that no. of transistors placed on single chip will double every 2-years.
-very accurate.
-called "Moore's Law"
-more transistors in CPU means more functionality.
-ALSO, no. of CPU's on single-chip increased.
-common chip is quad-core processor - contains 4 complete CPU's.

3. Faster -
-smaller transist also faster transist.
-so smaller is doubling factor: - get more of them and each one faster.
-gigahertz - measure speed of computer.
-speed = how many GHz computer will run.
-1 GHz = clock emits signal every nanosecond.
-universal speed limit - speed of light.
-186,282 mi/second.
-electrical signal is same speed.
-SO, in 1 nanosecond (1 GHz) - travels 11.8 inches.
-@ 2 GHz - goes 5.9 inches.
-@ 4 GHz - goes 2.95 inches
-further doubling speed of present computers limited by distance electricity can travel.
-(more on speed below…)
-clock - not like wall clock.
-is like drummer.
-keeps rhythm/beat - coordinates rhythm of all other parts.
-faster drummer beats, faster bnd plays.
-role of clock is to coordinate components of CPU chip.
-emits signal - indicates that next operation to occur.
-w/ every 'beat' of clock, another 'cycle' occurs on chip.
-another set of operations can occur.
-fast clock runs, faster chip runs - more instructions executed.

-benchmarking - process of measuring computing operations.
-diff systems are better/worse.
-manufacturers list IPS.
-IPS - instructions per second.
-no. of instructions (addition) done every second.
-MIPS = millions of IPS's.
-since 2000, clock speed not increased dramatically.
-BUT, multiple CPUs on chip increases the IPS of CPU.

-0.7 - High-level Look at Modern Computer -
-architecture - describes various parts of computer and how interact.
-higher-level view of elements of computer.
-named after stored program model of John von Neumann.
-computer standard architecture:
1. Processor - computational heart of computer.
-CPU - central processing unit.
-parts of CPU:
1) ALU - arithmetic and logic unit.
-logical calculations done here.
2) cache - local, fast storage
3) bus - connections among components.
2. Main memory - where data stored for processor to use.
-also called RAM - random access memory retrievable in any order.
-use of non-volatile memory increasing - esp in portable devices.
3. Disk - permanent, non-volatile storage.
-hard drive.
-data moved from here to main memory before can be used by processor.
-relatively slow, mechanical devices.
-replaced by non-mechanical memory cards in portables.
4. Input/Output - devices convert external data into digital data AND vice versa.
-done for use and storage of data in computer.
5. Network - another input/output device.
-allows computer to communicate w/ other computers.

-ΔΔ- Simple operation
-ΔΔ- simple operation:
theSum = num1 + num2
-variables - readable names that contain values to be used in a program.
-theSum, num1 and num2.
-statement adds 2 numbers stored in num1 and num2.
-result, the Sum, stored on disk.
-instruction itself also resides on disk.
-How this works:
-FETCH-DECODE-EXECUTE-STORE CYCLE - fundamental for all computers.
-simple sequential process is basis for all modern computers.
-4 steps - done in lockstep to beat of clock.
1. fetch instruction - when processor ready to process an instruction, fetches from memory.
-if instr not in memory, but on disk, then first fetches from disk.
-so, ΔΔ moves from disk, thru memory, to CPU.
2. decode instruction - action of processor examining instruction.
-decodes it.
-sees operands num1 and num2, so fetches them from memory.
-if not in in memory, but on disk, memory first fetches them from disk.
3. execute operation - once processor has both the instruction and the operands, performs operation.
4. store result - after operation, processor stores resulting data in memory.
-at some point (before power shut off) data in memory stored on disk.
5. repeat - go back to no. 1. (fetch instruction) to get next instruction in the program.

-operations not complex.
-ALU can:
1- add
2- subtract
3- multiply
4- divide
5- compare values and choose next instruction based on comparison.
-processor can handle only 2-types of operands:
1- integers
2- floating points - fractional values represented in decimal notation.
-separate ALU for integers and separate one for floating-point no.s called FPU - Floating Point Unit.

-0.8 Representing Data - (pg 25 ebook)
-cuz underlying element of computer is switch (transistor), most obvious way to represent data values is binary.
-binary = 1 or 0.
-corresponds to physical nature of switch - ON/OFF.
-can represent no.s and other data as binary.

-BINARY DATA - digital computer is binary - base 2.
-we use base 10.
-Babylonians used base 60.
-source of modern time-keeping and angle measurement.
-binary - 2 digits - 0,1.
-two reasons for using binary -
1) hardware - switches
- ON/OFF is directly translated to 1 or 0.
- electronic transistors lend selves naturally to base two.
2) two digits easy to store and easy to operate on.
-storage devices in binary need a medium that has two states:
a) one
b) zero.
-anything w/ two states can become digital storage.
-high/low voltage
-left/right magnetism.
-ΔΔ- main memory has small capacitors that hold charge (1) or not (0).
-disks magnetized one way (1) or other (0).
-CDs and DVDs reflect lite one way (1) or other (0).
-manipulations of underlying data done simply w/ electronic gates - Boolean logic: and, or, not.
-logical circuits extremely small/fast - do calcs quickly/efficiently.
-bits - bit = BInary digiT
-ΔΔ- adder circuit.
-adds 2 bits.
-w/ logical circuits - sum = (A and (not B)) or ((not A) and B)).
-all other math operations built from this.
-choice made based on value of bit.
-SO, entire ALU and rest of computer built from logical operators: and, or, not.

-Working w/ binary -
-ΔΔ- 735 = 73510
-735 = 7 * 102 + 3 * 101 + 5 * 100
2 1 0
-in binary - only 2 digits.
-rightmost 3-places are: 4s, 2s, 1s instead of
100s, 10s, 1s.
-ΔΔ- 101 in binary (1012) = _ in base 10 (decimal).
- 1012 = 1 * 22 + 0 * 21 + 1 * 20
= 1 * 4 + 0 * 2 + 1 * 1
= 4 + 0 + 1
= 510
-decimal integers represented in binary.
-fractions represented using integers in scientific notation:
-1/8 = 0.125 = 125 * 10-3
-mantissa - 125
-exponent - -3
-both are integers can be stored in binary.

-binary fractions stored using binary matissas and binary exponents.
-no. of bits allocated to matissa and to exponents varies in diff computers.
-BUT both no.s stored together.
-Four important concepts re: representation:
1) all no.s in computer represented in binary.
2) there is limit re: how big integer can be stored in one unit of computer memory.
-32 or 64 bits of storage.
3) fractions represented in scientific notation.
-are approximations
4) everything converted to binary for manip and storage:

-most computers organize data into words.
-usually 32 bits in size.
-now 64-bit words more common.
-infinite no. integers.
-only 32 bits to use - SO limited no. of integers that can fit.

-positive/negative integers - can represent 232 w/ 32-bit word.
-a little over 4 billion integers.
-to represent pos and neg integers evenly, represented range is from -2billion to +2billion.

-64-bit represents 264 integers w/ 64-bit word.
- is 1.8 x 1019
- is 4 billion times more can be stored w/ 32-bit word.

- prob w/ fractions:
-between every pair Real no.s is infinite no. of Real no.s
-no matter how represent Real no.s in binary, is always approximation.
-ΔΔ- enter 19/5.0 in Py and will be 3.7999999999999998 instead 3.8.
-this is feature of storage in binary - not feature of Py.

-bits, bytes, words -
-bit - binary digit.
-byte - 8 bits.
-storage counted in bytes.
-2 GB of RAM = approx 2 billion bytes of memory.
-(actually 231 bytes = 2,147,483,648 bytes).
-use bytes for hxical reasons.
-words - built from bytes w/ each byte containing 8-bits.
-32-bit word is 4 bytes.

-Representing Letters -
-letters = characters.
-characters - what we see on printed page.
- mostly made from letters, digits, and punctuation.
-characters that are not printable -
-carriage return
-characters that controlled printers - form feed

-ASCII - American Standard Code for Information Interchange.
-first standardized set of computer characters.
-ASCII table (Appendix F) is mapping between no.s and characters.
-ea character has associated no.
-"A" is 65, "a" is 97.
-can interpret set of no.s as characters and then manipulate characters as would no.s.
-ASCII developed by English speakers.
-computing spreads and so need richer character set for other langs.
-Unicode - developed to provide more room to represent the many characters in all langs.
-esp Chinese - has more than thousand characters.

-Py does Unicode and ASCII.

-Representing Other Data -
1. Images -
-discrete image - built of many individ parts.
-can represent individ parts as no.s
-pixels - "picture elements"
-small dots used to create image.
-thousands of them.
-ea pixel represented as location on 2-dimen grid.
-location is 2 numbers.
-ea pixel has associated color.
-color is also no.
-colors broken into contribution from:
-color number - represents how much of ea basic color contributed to final color.
-8-bit color scheme - ea color can contribute 8-bits or 28 = 256 possible shades of the color.
-SO, 24-bit color system gives 224 diff colors = 16,777,216.

-quantity of pixels important.
-standard analog TV was 525 lines pixels w/ ea line containing 480 = 252,000.
-HD TV has 1920 x 1080 = 2,073,600.

-Music -
-2 types of musical sound:
1) recorded sound -
-sound wave - complex wave of air pressure detected w/ ears.
-can see how complex w/ oscilloscope.
-record height of wave @ very high rate - (ΔΔ- 44,100 times/second = 44kHz = sampling rate on most MP3s)
-record height as no. for that time in sound wave.
-SO, record 2 no.s:
1- height of recorded sound.
2- time when height recorded.
-when play back, reproduce sound wave shape by creating new sound wave that has same height @ selected times as recorded sound @ ea. pt in time.
-more often 'sample' sound wave, better can reproduce accurately.
2) generated sound - to generate new sound, write comp programs that generate same data:
-height of wave
-some point in time
-then data played just like recorded sound.

-What numbers represent -
-all data recorded for use by computer represented as no.
-ΔΔ's- numeric, text, image, audio.
-can't look @ bits to tell what particular recorded data value represents.
-depends on use for which data value was recorded.
-depends on TYPE of the data.
-TYPE of data - indicates for what use data values are to be used.
-ΔΔ's diff data types:
-characters - then no.s represent ASCII values.
-floating-point - no.s represent mantissa & exponent of value.
-image - no.s represents color of partic pixel & location of pixel.

-Quantities of Data -
-amount of data that commercial disk holds grown dramatically over years.
-some common terms:
-when these used for data quantities, meanings are bit odd.
-here's why:
-103 or 1000 is pretty close to 210 or 1024.
-(using 2 cuz binary, base 2)
-so are discussing powers of 2 and powers of 10.
-use this rule of thumb:
-any time multiplying by 210, is pretty close to multiplying by 103, SO use power of 10 prefixes (below) we're used to.
-3 powers of 10 roughly equivalent to 10 powers of 2 - 103 = 210
-kilobytes - KB
-(typically refers to 103 or 1000 of something).
-NOT really 1000 (103) bytes.
-IS 1024 (210) bytes.
-megabytes - MB
-(typically refers to 106 or 1-million of something).
-NOT really 1-million (106) bytes.
-IS 1,048,576 (220) bytes.
-gigabytes - GB
-(typically refers to 109 or 1-billion of something).
-NOT really 1-billion (109) bytes.
-IS 1,073,741,824 (230) bytes.

-NOW, largest standard size commercial disk is 1 terabyte.
-1 trillion bytes (1012).
-NEAR FUTURE, will be 1 petabyte.
-1 quadrillion bytes (1015).

-1956 - first disk intro'd by IBM.
-4 megabytes.
-35-years later - finally 1-gigabyte disk made.
-only 14-years later - 500-gigabyte disks.
-only 2-years later - 1-terabyte disks (= 1000 gigabytes)
-predicted for 2010 - 1-petabyte disk.

-Petabyte - 1-million gigabytes.
-book is about 1 megabyte data.
-read one book every day for life of 80-years < 30 gigabytes data.
-have 999,970 gigabytes left.

-picture about 4 megabytes per pic.
-look @ 100-images per day for life of 80-years = 30 terabytes.
-have 970,000 gigabytes left after all pics and books.
-music - MP3 is about 1 megabyte per minute.
-if listen 24/7 for 80-years, is 42 terabytes.
-music, pics and books still leaves 928,000 gigabytes.
-DVDs - is about 2 gigabytes per hour.
-petabyte holds 500,000 hours of video.
-record 24/7, will last for 57-years.

-Other Prefixes:
1. exa - 260 = 1,152,921,504,606,846,976
2. zeta - 270 = 1,180,591,620,717,411,303,424
3. yotta - 280 = 1,208,925,819,614,629,174,706,176

-2006 - total Inet traffic in US about 8.4 exabytes.
-video increases.
-SO, end of 2007, YouTube is 600 petabytes/year.
-amateur video - 10 exabytes per year.
-vid conferencing and Inet gaming increasing LOTS.
-these 2 could create the "Exaflood" of the Inet.
-estimate for 2015 = 1000 exabytes/year.
-this = 1 zetabyte.