XII. FILE PROCESSING

Engr. Ranel O. Padon

PYTHON PROGRAMMING TOPICS

I. Introduction to Python Programming

II. Python Basics

III. Controlling the Program Flow

IV. Program Components: Functions, Classes, Packages, and Modules

V. Sequences (Lists and Tuples) and Dictionaries

VI. Object-Based Programming: Classes and Objects

VII. Customizing Classes and Operator Overloading

VIII. Object-Oriented Programming: Inheritance and Polymorphism

IX. Randomization Algorithms

X. Exception Handling and Assertions

XI. String Manipulation and Regular Expressions

XII. File Handling and Processing

XIII. GUI Programming Using Tkinter

File Processing

Data Hierarchy

File-Open Modes

Dissecting Files

The Power of Buffering

FILE HANDLING

Variables offer only temporary storage of data.

Their contents are lost when they go out of scope or when the program terminates.

FILE HANDLING

Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.

Data maintained in files is called persistent data.

FILE HANDLING | Data Hierarchy

Bit ("binary digit"): the smallest data item a computer handles.

A bit is a digit that can assume one of two values, 0 or 1.

FILE HANDLING | Data Hierarchy

Programming directly with low-level bit patterns is tedious and error-prone.

Programmers work with decimal digits, letters, and symbols instead.

FILE HANDLING | Data Hierarchy

Characters consist of digits, letters, and special symbols.

Characters are represented as combinations of bits (bytes).
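
To make the character-to-bits relationship concrete, here is a minimal sketch. The helper name to_bits is hypothetical, and UTF-8 encoding is an assumption (for plain ASCII letters it matches one byte per character).

```python
def to_bits(text):
    """Return the bytes of `text` as space-separated 8-bit strings."""
    # Assumption: the text is encoded as UTF-8 before inspecting its bytes.
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


print(to_bits("A"))   # 01000001
print(to_bits("Hi"))  # 01001000 01101001
```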

FILE HANDLING | Data Hierarchy

Field (Column) is a collection of characters, represented as words.

Record (Row) is a collection of fields, represented as a tuple, a dictionary, or an instance of a class.

File (Table) is a collection of records, implemented as a sequential-access or random-access file.

Database (Folder) is a collection of files, managed by database management system (DBMS) software.
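
The three record representations named above can be sketched as follows. The field names, the sample values, and the Person class are hypothetical and only illustrate the hierarchy.

```python
# One hypothetical record (row) made of three fields (columns).
record_as_tuple = ("Juan", "Dela Cruz", 21)
record_as_dict = {"first_name": "Juan", "last_name": "Dela Cruz", "age": 21}


class Person:
    """A record represented as an instance of a class."""

    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age


record_as_object = Person(*record_as_tuple)

# A file (table) is then a collection of such records.
table = [record_as_tuple, ("Maria", "Santos", 34)]
```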

FILE HANDLING | open() & close()

magical_file = open("file_name.txt" [, a|a+|r|r+|w|w+] [, buffer_mode])

magical_file.close()
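
A minimal sketch of the open/close pair. The temporary directory is an assumption made so the example does not touch real files, and try/finally is one common way to guarantee the close.

```python
import os
import tempfile

# Hypothetical file placed in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "file_name.txt")

magical_file = open(path, "w")   # "w" creates the file if it is missing
try:
    magical_file.write("hello\n")
finally:
    magical_file.close()         # always release the file handle

was_closed = magical_file.closed
print(was_closed)
```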

FILE HANDLING | Other Functions

FILE HANDLING | open()

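Open Mode  Read  Write  Appends  Overwrites  Creates  Cursor @ Start  Cursor @ EOF
r          yes   no     no       no          no       yes             no
r+         yes   yes    no       no          no       yes             no
w          no    yes    no       yes         yes      yes             no
w+         yes   yes    no       yes         yes      yes             no
a          no    yes    yes      no          yes      no              yes
a+         yes   yes    yes      no          yes      no              yes

"Overwrites" means the existing content is discarded (truncated) when the file is opened.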

FILE HANDLING | Common Modes

Open Mode  Read  Write  Appends  Overwrites  Creates  Cursor @ Start  Cursor @ EOF
r          yes   no     no       no          no       yes             no
w          no    yes    no       yes         yes      yes             no

FILE HANDLING | open()

"r" is the default file-open mode

open("input.dat") is equivalent to open("input.dat", "r")
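
One way to check the default is to inspect the mode attribute of the returned file object. This is a sketch; the temporary path is an assumption so the example is self-contained.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "input.dat")

# Create the file first: "r" never creates a missing file.
seed = open(path, "w")
seed.write("42\n")
seed.close()

implicit = open(path)        # no mode given
explicit = open(path, "r")   # mode given explicitly
modes = (implicit.mode, explicit.mode)
implicit.close()
explicit.close()

print(modes)  # ('r', 'r')
```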

FILE HANDLING | r
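
A minimal sketch of common ways to read a file opened in "r" mode. The file name and its three lines are hypothetical sample data.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Prepare hypothetical sample data.
seed = open(path, "w")
seed.write("first\nsecond\nthird\n")
seed.close()

reader = open(path, "r")
whole_text = reader.read()       # the entire file as one string
reader.seek(0)                   # move the cursor back to the start
first_line = reader.readline()   # one line, newline included
remaining = reader.readlines()   # list of the lines after the cursor
reader.close()

print(first_line, remaining)
```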

FILE HANDLING | r

FILE HANDLING | r

FILE HANDLING | w

try removing line #6

try removing "\n" in lines #3 and #4
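
The listing these two suggestions refer to is not part of this text, so its line numbers cannot be matched here. As a stand-in, the sketch below assumes two write() calls that end in "\n" followed by a close(), and shows what changes when the newlines are dropped. File name and content are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.txt")

writer = open(path, "w")
writer.write("Hello\n")
writer.write("World\n")
writer.close()               # flushes buffered text to disk

check = open(path)
with_newlines = check.read()
check.close()

writer = open(path, "w")     # "w" discards the earlier content
writer.write("Hello")        # no "\n": the next write continues the same line
writer.write("World")
writer.close()

check = open(path)
without_newlines = check.read()
check.close()

print(repr(with_newlines), repr(without_newlines))
```

Skipping close() is risky because written text may still sit in the buffer and not yet be on disk when another part of the program reads the file.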

FILE HANDLING | with-as Keyword
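
A minimal sketch of the with-as form, which closes the file automatically when the block ends, even if an exception is raised inside it. The file name notes.txt is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as notes:
    notes.write("saved without an explicit close()\n")

# The file object still exists here, but it is already closed.
closed_after_block = notes.closed

with open(path) as notes:
    content = notes.read()

print(closed_after_block, repr(content))
```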

FILE HANDLING | Parsing

Paninda.txt

FILE HANDLING | Parsing | split
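
A sketch of parsing with str.split(). The actual layout of Paninda.txt is not shown in this text, so the sample below assumes one item per line, an item name and a price separated by whitespace.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Paninda.txt")

# Hypothetical content: "<item> <price>" on each line.
with open(path, "w") as out:
    out.write("Suka 15.50\nToyo 12.00\nAsin 5.25\n")

items = []
with open(path) as source:
    for line in source:
        name, price = line.split()       # split on any whitespace
        items.append((name, float(price)))

total = sum(price for _, price in items)
print(items, total)
```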

FILE HANDLING | Parsing | csv

Paranormal_Sightings.csv
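
A sketch using the standard csv module. The columns and rows are hypothetical, since the contents of Paranormal_Sightings.csv are not shown in this text. Unlike a plain split(","), csv.reader keeps a quoted field that contains a comma in one piece.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Paranormal_Sightings.csv")

# Hypothetical sample data; newline="" is what the csv docs recommend.
with open(path, "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["creature", "location", "count"])
    writer.writerow(["Aswang", "Capiz", 3])
    writer.writerow(["Kapre", "Los Banos, Laguna", 1])

with open(path, newline="") as source:
    rows = list(csv.reader(source))

header, records = rows[0], rows[1:]
print(header)
print(records)   # every field comes back as a string
```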

FILE HANDLING | Parsing | strip
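
A sketch of cleaning up fields with strip() and rstrip(). The raw line is hypothetical sample data.

```python
# A hypothetical line as it might come from a file: stray spaces and a newline.
raw_line = "  Suka , 15.50 \n"

# strip() removes leading and trailing whitespace, including the newline.
fields = [field.strip() for field in raw_line.split(",")]

# rstrip("\n") removes only trailing newline characters.
trailing_only = raw_line.rstrip("\n")

print(fields)               # ['Suka', '15.50']
print(repr(trailing_only))  # '  Suka , 15.50 '
```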

FILE HANDLING | Parsing & Classes
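
A sketch that combines parsing with classes: each line of a file becomes one object. The Item class, the from_line helper, and the comma-separated layout are all assumptions made for illustration.

```python
import os
import tempfile


class Item:
    """One record of a hypothetical price list."""

    def __init__(self, name, price):
        self.name = name
        self.price = price

    @classmethod
    def from_line(cls, line):
        # Assumed layout: "<name>,<price>"
        name, price = line.strip().split(",")
        return cls(name.strip(), float(price))


def load_items(path):
    """Build one Item per non-blank line of the file."""
    with open(path) as source:
        return [Item.from_line(line) for line in source if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "items.txt")
with open(path, "w") as out:
    out.write("Suka,15.50\nToyo,12.00\n\n")

loaded = load_items(path)
print([item.name for item in loaded], sum(item.price for item in loaded))
```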

FILE HANDLING | Parsing & Classes 2

FILE HANDLING | HTML Parsing

MangJose.html
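
The parsing technique used for MangJose.html is not preserved in this text. As a substitute, the sketch below uses the standard library's html.parser module to collect the text of list items. The HTML content and the ListItemCollector class are hypothetical.

```python
import os
import tempfile
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect the text found inside <li> elements."""

    def __init__(self):
        super().__init__()
        self.inside_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.inside_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.inside_item = False

    def handle_data(self, data):
        if self.inside_item and data.strip():
            self.items.append(data.strip())


# Hypothetical page content standing in for MangJose.html.
path = os.path.join(tempfile.mkdtemp(), "MangJose.html")
with open(path, "w") as out:
    out.write("<html><body><h1>Menu</h1>"
              "<ul><li>Adobo</li><li>Sinigang</li></ul></body></html>")

collector = ListItemCollector()
with open(path) as source:
    collector.feed(source.read())

print(collector.items)  # ['Adobo', 'Sinigang']
```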

FILE HANDLING | r+, w+, a+

All of the "plus" modes allow reading and writing; the main difference between them is where we are positioned in the file.

"r+" puts us at the beginning.

"w+" puts us at the beginning, which is also the end, because the file is truncated.

"a+" puts us at the end.
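
The starting positions can be checked with tell(). This is a sketch under the assumption of Python 3 and a hypothetical five-character file; "w+" is opened last because it erases the content.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

seed = open(path, "w")
seed.write("hello")              # 5 characters, no newline
seed.close()

f = open(path, "r+")
r_plus_position = f.tell()       # at the beginning
f.close()

f = open(path, "a+")
a_plus_position = f.tell()       # at the end of the existing content
f.close()

f = open(path, "w+")
w_plus_position = f.tell()       # beginning of a now-empty file
f.close()

size_after_w_plus = os.path.getsize(path)
print(r_plus_position, a_plus_position, w_plus_position, size_after_w_plus)
```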

FILE HANDLING | w+

FILE HANDLING | Buffering

-1 is the default file-open buffering argument

open("input.dat") is equivalent to open("input.dat", "r", -1)

Flag  Meaning
0     unbuffered
1     line buffered
n     buffered with a buffer of size n (n > 1)
-1    system default
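
A sketch of the four flag values, assuming Python 3, where 0 is accepted only in binary mode and 1 (line buffering) applies only to text mode. The kind of object returned by open() shows which buffering layer is in use.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "input.dat")
seed = open(path, "wb")
seed.write(b"data\n")
seed.close()

unbuffered = open(path, "rb", 0)     # 0: raw, unbuffered (binary mode only)
default = open(path, "rb")           # -1: system default buffer size
sized = open(path, "rb", 4096)       # n: buffer of 4096 bytes
line_buffered = open(path, "r", 1)   # 1: line buffering (text mode only)

kinds = tuple(type(f).__name__
              for f in (unbuffered, default, sized, line_buffered))
is_line_buffered = line_buffered.line_buffering

for f in (unbuffered, default, sized, line_buffered):
    f.close()

print(kinds, is_line_buffered)
```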

FILE HANDLING | Creating A Big File!

FILE HANDLING | Unbuffered r

Then, let’s read that big file.

FILE HANDLING | Buffered r

Now, with the help of buffering.
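
The comparison can be sketched as below. The file size (100,000 bytes) and the one-byte-at-a-time reading loop are assumptions chosen to make the difference visible; actual timings depend on the machine, so none are quoted here.

```python
import os
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "big_file.dat")

# Create a moderately big file (size chosen arbitrarily for this sketch).
big = open(path, "wb")
big.write(b"x" * 100_000)
big.close()


def count_bytes_one_at_a_time(path, buffering):
    """Read the file one byte per call and return how many bytes were read."""
    count = 0
    with open(path, "rb", buffering) as source:
        while source.read(1):
            count += 1
    return count


def timed(buffering):
    start = time.perf_counter()
    count = count_bytes_one_at_a_time(path, buffering)
    return count, time.perf_counter() - start


unbuffered_count, unbuffered_seconds = timed(0)    # every read hits the OS
buffered_count, buffered_seconds = timed(-1)       # reads served from memory

print("unbuffered:", unbuffered_seconds, "s")
print("buffered:  ", buffered_seconds, "s")
```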

FILE HANDLING | Buffered By Default

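In some other languages, buffering must be requested explicitly.

Java's FileInputStream and FileReader are unbuffered unless wrapped in a BufferedInputStream or BufferedReader; in C, the low-level read() and write() calls are unbuffered, although stdio streams opened with fopen() are buffered by default.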

FILE HANDLING | What else?

1. Random-Access Files: for fast searching/editing of records

* use the shelve module

* shelve.open()

2. Serialization: converting objects into a byte stream that can be stored in a file;

useful for transferring data (objects, sequences, etc.)

across a network connection or saving the state of a game

* use the pickle or cPickle module (cPickle exists only in Python 2; Python 3 provides pickle alone)

* cPickle.dump(stringList_to_be_written, serialized_file)

* records = cPickle.load(serialized_file)
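
Both items in the list above can be sketched in a few lines. This assumes Python 3, where pickle takes the place of cPickle; the keys, records, and file names are hypothetical.

```python
import os
import pickle
import shelve
import tempfile

folder = tempfile.mkdtemp()

# 1. Random access by key with shelve: a persistent, dictionary-like object.
shelf_path = os.path.join(folder, "records")
with shelve.open(shelf_path) as shelf:
    shelf["1001"] = {"name": "Juan", "balance": 250.0}
    shelf["1002"] = {"name": "Maria", "balance": 900.0}

with shelve.open(shelf_path) as shelf:
    maria = shelf["1002"]        # fetched directly by key, no scan needed

# 2. Serialization with pickle: objects are written as bytes ("wb"/"rb").
string_list = ["alpha", "beta", "gamma"]
pickle_path = os.path.join(folder, "strings.pickle")

with open(pickle_path, "wb") as serialized_file:
    pickle.dump(string_list, serialized_file)

with open(pickle_path, "rb") as serialized_file:
    records = pickle.load(serialized_file)

print(maria, records)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.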

PRACTICE EXERCISE | MC CHART

PRACTICE EXERCISE | MORSE CODE

A. Read a file containing Filipino/English-language phrases and encode it into Morse code.

B. Read a Morse code file and convert it into the Filipino/English-language equivalent.

Use one blank between each Morse-coded letter and three blanks between each Morse-coded word.
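
As a starting point only, the spacing rule can be sketched on plain strings. The function names encode and decode are hypothetical, only the letters A to Z are covered (other characters are skipped), and reading from and writing to files is left to the exercise.

```python
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}
LETTERS = {code: letter for letter, code in MORSE.items()}


def encode(phrase):
    """One blank between coded letters, three blanks between coded words."""
    words = phrase.upper().split()
    return "   ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )


def decode(morse_text):
    """Inverse of encode(); the result is upper-case."""
    words = morse_text.strip().split("   ")
    return " ".join(
        "".join(LETTERS[code] for code in word.split()) for word in words
    )


print(encode("SOS"))          # ... --- ...
print(decode(encode("Hi Ma")))  # HI MA
```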

REFERENCES

Deitel, Deitel, Liperi, and Wiedermann - Python: How to Program (2001).

Disclaimer: Most of the images/information used here have no proper source citation, and I do not claim ownership of these either. I don’t want to reinvent the wheel, and I just want to reuse and reintegrate materials that I think are useful or cool, then present them in another light, form, or perspective. Moreover, the images/information here are mainly used for illustration/educational purposes only, in the spirit of openness of data, spreading light, and empowering people with knowledge.
