Best Python IDEs And Code Editors

Python is a popular high-level programming language, first released in 1991.

Python is mainly used for server-side web development, software development, mathematics, scripting, and artificial intelligence. It runs on multiple platforms, including Windows, Mac, Linux, and Raspberry Pi.

Before exploring Python IDEs in more detail, we must first understand what an IDE is.

What Is an Integrated Development Environment (IDE)

IDE stands for Integrated Development Environment.

An IDE is a software package that bundles the tools used for developing and testing software. Throughout the SDLC, a developer uses many tools, such as editors, libraries, compilers, and testing platforms.

An IDE automates part of the developer’s work by reducing manual effort and combining all of these tools in a common framework. Without an IDE, the developer has to select, integrate, and deploy the tools manually. IDEs were developed to simplify the SDLC process by reducing the amount of coding and helping to avoid typing errors.

In contrast to an IDE, some developers prefer code editors. A code editor is essentially a text editor in which a developer writes the source code for any software and saves it as small text files.

Compared to IDEs, code editors are faster and smaller. Some code editors can also execute and debug code.

Most Popular Python IDE FAQs

Listed below are the most frequently asked questions about Python IDEs and code editors.

Q #1) What is an IDE, and what is a text or code editor?

Answer:

An IDE is a development environment that provides many features, such as coding, compiling, debugging, executing, autocomplete, and libraries, in one place, which makes the developer’s tasks simpler. A code editor is a platform for editing and modifying code only.

Q #2) What is the difference between an IDE and a text editor?

Answer:

Both an IDE and a text editor can be used to develop software. A text editor helps the programmer write scripts and modify code or text.

With an IDE, a programmer can perform several other functions as well, such as running the code, version control, debugging, interpreting, compiling, auto-completion, automatic linting, using pre-defined functions, and working in a built-in terminal.

An IDE can be considered a development environment in which a programmer can write a script, compile it, and debug it, completing the whole process in one place.

An IDE also has an integrated file management system and deployment tools, and it provides support for SVN, CVS, FTP, SFTP, frameworks, etc. A text editor, by contrast, is a simple editor for source code and does not include any integrated tools or packages.

One advantage of a text editor is that it can modify all types of files rather than being tied to a particular language or file type. Both play an important role in their respective situations.

Q #3) Why do we need a good Python IDE, and how do we select one?

Answer:

A Python IDE offers many benefits: it helps to produce better-quality code, provides debugging features, makes notebooks handy to work with, and brings features such as compiling and deploying together in one place, which makes the developer’s work easier.

The ideal IDE depends entirely on the developer’s requirements: whether the developer has to code in multiple languages, and whether syntax highlighting, product compilation, greater extensibility, an integrated debugger, a drag-and-drop GUI layout, or features such as autocomplete and class browsers are required.

Comparison Table

IDE        User Rating     Size in MB    Developed in
PyCharm    4.5/5           BIG           JAVA, PYTHON
Spyder     May 4, 2018     BIG           PYTHON
PyDev      4.6/5           MEDIUM        JAVA, PYTHON
Idle       4.2/5           MEDIUM        PYTHON
Wing       May 4, 2018     BIG           C, C++, PYTHON

Top Python IDEs And Code Editors Comparison

This article discusses several Python IDEs and code editors and explains the information required to choose the best IDE for your organization.

#1) PyCharm

PyCharm

Type: IDE.

Price: US $ 199 per User – 1st year for Professional Developer.

Platform Support: WINDOWS, LINUX, MAC etc.

PyCharm is one of the most widely used Python IDEs and was created by JetBrains. It is one of the best IDEs for Python and provides everything a developer needs for productive Python development.

With PyCharm, developers can write neat and maintainable code. It helps them be more productive by offering smart assistance and taking care of routine tasks, which saves time and, in turn, cost.

Best Features:

  1. It comes with an intelligent code editor, smart code navigation, and fast, safe refactorings.
  2. PyCharm integrates features such as debugging, testing, profiling, deployment, remote development, and database tools.
  3. Besides Python, PyCharm supports Python web development frameworks, JavaScript, HTML, CSS, AngularJS, and live editing.
  4. It integrates well with IPython Notebook, the Python console, and the scientific stack.

Pros:

  1. It gives developers smart assistance with automatic code completion, error detection, quick fixes, etc.
  2. It supports multiple frameworks, which helps to reduce costs.
  3. It supports cross-platform development, so developers can write scripts on different platforms.
  4. PyCharm has a customizable interface, which in turn increases productivity.

Cons:

  1. PyCharm is expensive, even considering the features and tools it provides.
  2. The initial installation is difficult and may sometimes hang partway through.

Official URL: PyCharm

#2) Spyder

Spyder

Type: IDE.

Price: Open Source

Platform Support: QT, WINDOWS, LINUX, MAC OS etc.

Spyder is another big name among Python IDEs.

It is well known for Python development and was designed mainly for scientists and engineers, to provide a powerful scientific environment for Python. It offers advanced editing, debugging, and data exploration features, and it is highly extensible, with a good plugin system and API.

Because Spyder is built on PyQt, developers can also use it as an extension library in their own applications. It is a powerful IDE.

Best Features:

  1. It is a good IDE with syntax highlighting and automatic code completion.
  2. Spyder can explore and edit variables directly from the GUI.
  3. Its multi-language editor works well, with support for functions and automatic code completion.
  4. It integrates well with the IPython console, so a developer can inspect and modify variables on the go and execute code line by line or cell by cell.

Pros:

  1. It is very efficient at finding and eliminating the bottlenecks that limit code performance.
  2. It has a powerful debugger that traces each step of script execution.
  3. It can instantly display the documentation of any object and lets you work on your own documentation.
  4. It supports plugins that extend its functionality.

Cons:

  1. It does not let the developer configure which warnings to disable.
  2. Its performance drops when too many plugins are used at the same time.

Official URL: Spyder

#3) Pydev

PyDev

Type: IDE

Price: Open Source

Platform Support: QT, WINDOWS, LINUX, MAC OS etc.

PyDev is a third-party plugin for Eclipse.

It is an IDE used for Python development. It mainly focuses on refactoring Python code, graphical debugging, code analysis, etc.

Because it is an Eclipse plugin, it gives developers the flexibility of a full-featured environment for application development. Among open source IDEs, it is one of those preferred by developers.

Best Features:

  1. It is a capable IDE with Django integration, automatic code completion, and code coverage.
  2. It supports rich features such as type hinting, refactoring, debugging, and code analysis.
  3. PyDev supports PyLint integration, a tokens browser, an interactive console, unittest integration, a remote debugger, etc.
  4. It also supports Mypy, the Black formatter, virtual environments, and f-string analysis.

Pros:

  1. PyDev provides strong syntax highlighting, parser error reporting, code folding, and multi-language support.
  2. It has a good outline view, marks occurrences, and has an interactive console.
  3. It has good support for CPython, Jython, IronPython, and Django, and allows interactive probing in suspended mode.
  4. It provides tab preferences, smart indentation, PyLint integration, TODO tasks, keyword auto-completion, and content assistants.

Cons:

  1. PyDev plugins sometimes become unstable, which causes problems during application development.
  2. PyDev’s performance decreases if the application is very large and uses multiple plugins.

Official URL: PyDev

#4) Idle

IDLE

Type: IDE.

Price: Open Source.

Platform Support: WINDOWS, LINUX, MAC OS etc.

IDLE is a popular Integrated Development Environment written in Python, and it is bundled with the default Python distribution. It is one of the best IDEs for Python.

IDLE is a very simple and basic IDE used mainly by beginners who want to practice Python development. It is also cross-platform, which helps trainee developers a lot, but it is sometimes called a disposable IDE because developers move on to a more advanced IDE after learning the basics.

Best Features:

  1. IDLE is written purely in Python using the Tkinter GUI toolkit and is cross-platform, which increases flexibility for developers.
  2. It has a multi-window text editor with features such as call tips, smart indentation, undo, and Python colorizing.
  3. It has a powerful debugger with persistent breakpoints and views of the global and local namespaces.
  4. It also supports dialog boxes, browsers, and editable configurations.

Pros:

  1. Like other IDEs, IDLE supports syntax highlighting, automatic code completion, and smart indentation.
  2. It has a Python shell with syntax highlighting.
  3. Its integrated debugger shows the call stack, which makes developers more productive.
  4. In IDLE, a developer can search within any window, search through multiple files, and replace within editor windows.

Cons:

  1. It has some general usability issues: it sometimes loses focus, and the developer cannot copy directly to the dashboard.
  2. IDLE has no line-numbering option, which is a very basic interface feature.

Official URL: IDLE

#5) Wing

WING

Type: IDE

Price: US $ 95 to US $ 179 PER USER FOR COMMERCIAL USE.

Platform Support: WINDOWS, LINUX, MAC OS etc.

Wing is also a popular and powerful IDE in today’s market, with many of the features that developers require for Python development.

It comes with a strong debugger and a smart editor that make interactive Python development fast, accurate, and fun. Wing also provides a 30-day trial version so that developers can try out its features.

Best Features:

  1. Wing helps in moving around the code with go-to-definition, finding the uses of symbols in the application, a symbol index in the editor, a source browser, and effective multi-file search.
  2. It supports test-driven development with the unittest, pytest, and Django testing frameworks.
  3. It assists remote development and is customizable and extensible.
  4. It also has automatic code completion, displays errors in a clear manner, and supports line editing.

Pros:

  1. When the trial version expires, Wing gives developers around 10 minutes to migrate their application.
  2. It has a source browser that shows all the variables used in the script.
  3. Wing provides an additional exception-handling tab that helps a developer debug the code.
  4. It provides an extract-function operation in the refactoring panel, which helps developers work more efficiently.

Cons:

  1. It does not support dark themes, which many developers like to use.
  2. Wing’s interface can be intimidating at first, and the commercial version is very expensive.

Official URL: Wing

#6) Eric Python

Eric Python

Type: IDE.

Price: Open Source.

Platform Support: WINDOWS, LINUX, MAC OS etc.

Eric is a powerful, feature-rich Python IDE and editor that is itself developed in Python. It can be used both for everyday tasks and by professional developers.

It is built on the cross-platform Qt toolkit, integrated with the flexible Scintilla editor. Eric has an integrated plugin system that makes it simple to extend the IDE’s functions.

Best Features:

  1. Eric has multiple editors, a configurable window layout, source code folding and call tips, error highlighting, and advanced search functions.
  2. It has advanced project management facilities, an integrated class browser, version control, cooperation functions, and source code.
  3. It offers a built-in debugger, built-in task management, and profiling and code coverage support.
  4. It supports application diagrams, syntax highlighting, and automatic code completion.

Pros:

  1. Eric has integrated support for unittest, CORBA, and Google protobuf.
  2. It has many wizards for regex and Qt dialogs, and tools for previewing Qt forms and translations, which make the developer’s task easier.
  3. It includes a web browser and a spell-checking library that helps avoid errors.
  4. It also supports localization and has a rope refactoring tool for development.

Cons:

  1. Installing Eric is sometimes clumsy, and it does not have a simple and easy GUI.
  2. When developers integrate too many plugins, the productivity and performance of the IDE decrease.

Official URL: Eric Python

#7) Rodeo

Rodeo

Type: IDE.

Price: Open Source.

Platform Support: WINDOWS, LINUX, Mac OS etc.

Rodeo is one of the best IDEs for Python and was developed for data science tasks such as loading data and information from different sources and plotting.

It supports cross-platform functionality. It can also be used as an IDE for interactive experimentation.

Best Features:

  1. It supports the functions required for data science and machine learning tasks, such as loading data and experimenting with it.
  2. It allows developers to interact with data, compare it, inspect it, and plot it.
  3. Rodeo helps to produce clean code and offers automatic code completion, syntax highlighting, and IPython support so that code can be written faster.
  4. It also has a visual file navigator, point-and-click directory browsing, and package search, which make it easier for developers to find what they want.

Pros:

  1. It is a lightweight, highly customizable, and intuitive development environment, which makes it unique.
  2. It has both a text editor and an IPython console.
  3. It includes all the supporting documentation in the last tab for better understanding.
  4. It has Vim and Emacs modes and allows single-line or block execution of code.
  5. Rodeo can also update itself automatically to the latest version.

Cons:

  1. It is not properly maintained.
  2. The company does not offer extended support when issues arise.

Official URL: Rodeo

#8) Thonny

Thonny IDE

Type: IDE.

Price: Open Source.

Platform Support: WINDOWS, LINUX, Mac OS etc.

Thonny is one of the best IDEs for beginners who have no prior Python experience and want to learn Python development.

Its features are basic and simple enough for even new developers to understand easily. It is very helpful for users who work with virtual environments.

Best Features:

  1. Thonny lets users see how programs and shell commands affect Python variables.
  2. It provides a simple debugger driven by the F5, F6, and F7 function keys.
  3. It lets a user see how Python internally evaluates a written expression.
  4. It also offers a clear representation of function calls, error highlighting, and automatic code completion.

Pros:

  1. It has a very simple and clean graphical user interface.
  2. It is very beginner-friendly and takes care of PATH issues and conflicts with other Python interpreters.
  3. The user can change the mode used for explaining references.
  4. It helps to explain scopes by highlighting the relevant spots.

Cons:

  1. The interface design is weak: it is limited to text editing and lacks support for templates.
  2. Creating plugins is really slow, and many features that developers expect are missing.

Official URL: Thonny

Best Python Code Editors

Code editors are text editors used to edit source code as required.

They may be integrated or stand-alone applications. Because they are single-purpose, they are also very fast. Listed below are some of the top code editors preferred by Python developers worldwide.

#1) Sublime Text

Sublime Text

Type: Source Code Editor.

Price: USD $80.

Platform Support: WINDOWS, LINUX, Mac OS etc.

Sublime Text is a very popular cross-platform text editor developed in C++ and Python, and it also has a Python API.

It is designed to support many other programming and markup languages and allows users to add functions with the help of plugins. According to developer reviews, it is more reliable than other code editors.

Best Features:

  1. Sublime Text has Goto Anything for opening files with a few keystrokes and navigating to words or symbols.
  2. Its multiple selections feature changes many things at once, and its command palette gives access to functions such as sorting, changing the syntax, and changing the indentation.
  3. It has high performance, a powerful API, and a package ecosystem.
  4. It is highly customizable, supports split editing and instant project switching, and is cross-platform.

Pros:

  1. It has good compatibility with language grammars.
  2. It allows users to set project-specific preferences.
  3. Its Goto Definition feature uses an application-wide index of every method, class, and function.
  4. It shows high performance and has a powerful cross-platform user interface toolkit.

Cons:

  1. Sublime Text can be intimidating to new users at first.
  2. It does not have a strong Git plugin.

Official URL: Sublime Text

#2) Atom

Atom

Type: Source Code Editor.

Price: Open Source.

Platform Support: WINDOWS, LINUX, Mac OS etc.

Atom is a free source code editor. It is a desktop application built with web technologies, with plugin support developed in Node.js.

It is based on Atom Shell (now known as Electron), a framework that helps to achieve cross-platform functionality. The best thing is that it can also be used as an Integrated Development Environment.

Best Features:

  1. Atom handles cross-platform editing very smoothly, which increases the productivity of its users.
  2. It also has a built-in package manager and file system browser.
  3. It helps users write scripts faster with smart and flexible auto-completion.
  4. It supports multiple panes and finding and replacing text across an application.

Pros:

  1. It is really simple to use.
  2. Atom allows users to customize the UI.
  3. It has a lot of support from the team at GitHub.
  4. It has a strong feature for quickly opening files to retrieve data and information.

Cons:

  1. It takes more time to sort out configurations and plugins because it is a browser-based app.
  2. Tabs are clumsy, reduce performance, and sometimes load slowly.

Official URL: Atom

#3) Vim

Vim

Type: Source Code Editor.

Price: Open Source.

Platform Support: WINDOWS, LINUX, Mac OS, IOS, Android, UNIX, AmigaOS, MorphOS etc.

Vim is a popular open source text editor that is used to create and modify any type of text and is highly configurable.

According to developers, Vim is a very stable text editor whose performance improves with each new release. Vim can be used from a command-line interface as well as a standalone application.

Best Features:

  1. Vim has a persistent, multi-level undo tree.
  2. It comes with an extensive system of plugins.
  3. It supports a wide range of programming languages and file formats.
  4. It has powerful search and replace functionality and integrates with many tools.

Pros:

  1. Vim gives the user two main modes to work in: Normal mode and editing (Insert) mode.
  2. It comes with its own scripting language, which lets a user modify its behavior and add custom functionality.
  3. It also supports non-programming uses, which not every editor does.
  4. Keystroke sequences in Vim are commands, so a developer can save them and reuse them.

Cons:

  1. It is only a text-editing tool and does not use a different color for the pop-ups it shows.
  2. It does not have an easy learning curve and is difficult to learn at the beginning.

Official URL: Vim

#4) Visual Studio Code

Visual Studio Code

Type: Source Code Editor.

Price: Open Source.

Platform Support: WINDOWS, LINUX, Mac OS etc.

Visual Studio Code is an open source code editor that was developed mainly for building and debugging modern web and cloud projects.

It combines an editor with good development features very smoothly and is one of the major choices for Python developers.

Best Features:

  1. It supports syntax highlighting and automatic code completion with IntelliSense, which completes code based on variable types, function definitions, etc.
  2. It has a powerful debugger, and the user can debug from within the editor.
  3. It has strong Git integration, so a user can perform Git operations such as commit and push straight from the editor.
  4. Visual Studio Code is highly extensible and customizable: languages, debuggers, themes, etc. can be added through extensions.

Pros:

  1. It provides multi-language support and many other capabilities that other editors do not have.
  2. It has a good layout and smart interface.
  3. It allows the use of many plugins, which a developer can get from the VS Code Marketplace to customize it.
  4. It supports vertical orientation and multi-split windows.

Cons:

  1. Searching in Visual Studio Code is very slow.
  2. It takes a considerable amount of time to launch initially.

Official URL: Visual Studio Code

Summary

We hope this article has given you a clear picture of what Python IDEs and source code editors are.

It has covered the major differences between the two, why Python developers use Python IDEs for developing web or cloud applications, and how IDEs improve developer productivity and thereby increase profit.

The top Python IDEs preferred by most developers worldwide are covered in this article. We have also seen the benefits and drawbacks of each IDE, on the basis of which developers can decide which IDE is best for their project.

Large Scale Business: As these businesses have both finances and manpower, they prefer tools such as PyCharm, Atom, Sublime Text, Wing, etc., so that they get all the features along with extended support from the vendors for all their issues.

Middle and Small Scale Business: As these businesses look for tools that are open source and cover most of the features, they mostly prefer Spyder, PyDev, IDLE, Eric Python, and Visual Studio Code for their projects.

Regular Expression In Python

The re module defines several functions, constants, and an exception. Some of the functions are simplified versions of the full-featured methods for compiled regular expressions. Most non-trivial applications always use the compiled form.

re.compile(pattern, flags=0)

Compile a regular expression pattern into a regular expression object, which can be used for matching using its match() and search() methods, described below.

The expression’s behavior can be modified by specifying a flags value. Values can be any of the following variables, combined using bitwise OR (the | operator).

The sequence

prog = re.compile(pattern)
result = prog.match(string)

is equivalent to

result = re.match(pattern, string)

but using re.compile() and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program.
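As a minimal sketch of that reuse, the snippet below compiles one pattern and applies it to several strings, then repeats the check with the module-level function. The pattern and the sample strings are assumptions chosen only for illustration.

```python
import re

# Compile once, then reuse the pattern object for many inputs.
# The pattern and the sample strings are illustrative assumptions.
word_re = re.compile(r"[A-Za-z]+")

lines = ["alpha 1", "22 beta", "gamma"]

# match() only succeeds when the pattern matches at the start of the string.
starts_with_word = [bool(word_re.match(line)) for line in lines]

# The module-level function gives the same answers without an explicit compile.
same_result = [bool(re.match(r"[A-Za-z]+", line)) for line in lines]

print(starts_with_word)  # [True, False, True]
print(same_result)       # [True, False, True]
```

Keeping the compiled object in a module-level name is a common way to make the reuse explicit, although the cache described in the note below makes the difference small for programs with only a few patterns.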

Note

The compiled versions of the most recent patterns passed to re.match(), re.search() or re.compile() are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.

re.DEBUG

Display debug information about compiled expression.

re.I
re.IGNORECASE

Perform case-insensitive matching; expressions like [A-Z] will match lowercase letters, too. This is not affected by the current locale. To get this effect on non-ASCII Unicode characters such as ü and Ü, add the UNICODE flag.

re.L
re.LOCALE

Make \w, \W, \b, \B, \s and \S dependent on the current locale.

re.M
re.MULTILINE

When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.

re.S
re.DOTALL

Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.

re.U
re.UNICODE

Make the \w, \W, \b, \B, \d, \D, \s and \S sequences dependent on the Unicode character properties database. Also enables non-ASCII matching for IGNORECASE.

New in version 2.0.

re.X
re.VERBOSE

This flag allows you to write regular expressions that look nicer and are more readable by allowing you to visually separate logical sections of the pattern and add comments. Whitespace within the pattern is ignored, except when in a character class, or when preceded by an unescaped backslash, or within tokens like *?, (?: or (?P<...>. When a line contains a # that is not in a character class and is not preceded by an unescaped backslash, all characters from the leftmost such # through the end of the line are ignored.

This means that the two following regular expression objects that match a decimal number are functionally equal:

a = re.compile(r"""\d +  # the integral part
                   \.    # the decimal point
                   \d *  # some fractional digits""", re.X)
b = re.compile(r"\d+\.\d*")
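The other flags can be tried out in the same way. The following is a small sketch, using a made-up three-line string, of how IGNORECASE, MULTILINE, and DOTALL change matching and how flags are combined with the | operator.

```python
import re

# Sample text is an assumption made up for this illustration.
text = "First line\nsecond LINE\nthird line"

# IGNORECASE: "line" also matches "LINE".
ci = re.findall(r"line", text, flags=re.IGNORECASE)

# MULTILINE: '^' matches at the start of every line, not only of the string.
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)

# DOTALL: '.' also matches a newline, so one match can span several lines.
without_dotall = re.search(r"First.*third", text)
with_dotall = re.search(r"First.*third", text, flags=re.DOTALL)

# Flags are combined with bitwise OR.
combined = re.findall(r"^\w+ line$", text, flags=re.IGNORECASE | re.MULTILINE)

print(ci)                      # ['line', 'LINE', 'line']
print(first_words)             # ['First', 'second', 'third']
print(without_dotall is None)  # True
print(with_dotall is not None) # True
print(combined)                # ['First line', 'second LINE', 'third line']
```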

re.search(pattern, string, flags=0)

Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

re.match(pattern, string, flags=0)

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.

Note that even in MULTILINE mode, re.match() will only match at the beginning of the string and not at the beginning of each line.

If you want to locate a match anywhere in string, use search() instead (see also search() vs. match()).
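The difference between the two functions can be sketched as follows. The log-style text is a hypothetical example; the point is only that match() is anchored at the start of the string, even in MULTILINE mode, while search() scans the whole string.

```python
import re

# Hypothetical two-line input used only for illustration.
text = "error: disk full\nwarning: low memory"

# match() looks only at the beginning of the string.
m1 = re.match(r"warning", text)

# search() scans forward until it finds a match.
m2 = re.search(r"warning", text)

# MULTILINE does not change where match() starts looking...
m3 = re.match(r"^warning", text, flags=re.MULTILINE)

# ...but it lets '^' in search() match at the start of the second line.
m4 = re.search(r"^warning", text, flags=re.MULTILINE)

print(m1)          # None
print(m2.start())  # 17
print(m3)          # None
print(m4.start())  # 17
```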

re.split(pattern, string, maxsplit=0, flags=0)

Split string by the occurrences of pattern. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list. (Incompatibility note: in the original Python 1.5 release, maxsplit was ignored. This has been fixed in later releases.)

>>> re.split('\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split('(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split('\W+', 'Words, words, words.', 1)
['Words', 'words, words.']
>>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
['0', '3', '9']

If there are capturing groups in the separator and it matches at the start of the string, the result will start with an empty string. The same holds for the end of the string:

>>> re.split('(\W+)', '...words, words...')
['', '...', 'words', ', ', 'words', '...', '']

That way, separator components are always found at the same relative indices within the result list (e.g., if there’s one capturing group in the separator, the 0th, the 2nd and so forth).

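Note that in the Python 2 behavior documented here, split will never split a string on an empty pattern match (Python 3.7 and later do split on empty matches, so the results shown here differ there). For example: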

>>> re.split('x*', 'foo')
['foo']
>>> re.split("(?m)^$", "foo\n\nbar\n")
['foo\n\nbar\n']

Changed in version 2.7: Added the optional flags argument.

re.findall(pattern, string, flags=0)

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.

Note

Due to the limitation of the current implementation the character following an empty match is not included in a next match, so findall(r'^|\w+', 'two words') returns ['', 'wo', 'words'] (note missed “t”). This is changed in Python 3.7.

New in version 1.5.2.

Changed in version 2.4: Added the optional flags argument.
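How the return value of findall() depends on the number of groups is easiest to see side by side. This is a small sketch with an invented key=value string; the patterns are assumptions for illustration only.

```python
import re

# Invented sample input.
text = "width=80, height=24"

# No groups: a list of the whole matches.
whole = re.findall(r"\w+=\d+", text)

# One group: a list of that group's text.
values = re.findall(r"\w+=(\d+)", text)

# Two groups: a list of tuples, one tuple per match.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(whole)   # ['width=80', 'height=24']
print(values)  # ['80', '24']
print(pairs)   # [('width', '80'), ('height', '24')]
```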

re.finditer(pattern, string, flags=0)

Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result. See also the note about findall().
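Because finditer() yields match objects rather than strings, the position of each match is available as well as its text. A minimal sketch, with an assumed sample string:

```python
import re

# Assumed sample input for illustration.
text = "cat hat bat"

# Each item is a match object, so start() and end() can be read alongside
# the matched text.
spans = [(m.group(0), m.start(), m.end())
         for m in re.finditer(r"[a-z]+at", text)]

print(spans)  # [('cat', 0, 3), ('hat', 4, 7), ('bat', 8, 11)]
```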

New in version 2.2.

Changed in version 2.4: Added the optional flags argument.

re.sub(pattern, repl, string, count=0, flags=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth. Unknown escapes such as \j are left alone. Backreferences, such as \6, are replaced with the substring matched by group 6 in the pattern. For example:

>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string. For example:

>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'

The pattern may be a string or an RE object.

The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer. If omitted or zero, all occurrences will be replaced. Empty matches for the pattern are replaced only when not adjacent to a previous match, so sub('x*', '-', 'abc') returns '-a-b-c-'.

In string-type repl arguments, in addition to the character escapes and backreferences described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE.
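The \g<...> forms can be tried out with a short sketch like the one below. It assumes Python 3; the group names first and last and the sample strings are illustrative only.

```python
import re

# Named groups referenced by name in the replacement string.
swapped = re.sub(r"(?P<first>\w+) (?P<last>\w+)",
                 r"\g<last>, \g<first>",
                 "Isaac Asimov")

# \g<1>0 means "group 1 followed by a literal 0";
# a bare \10 would be read as a reference to group 10.
padded = re.sub(r"(\d)", r"\g<1>0", "a1b2")

print(swapped)  # Asimov, Isaac
print(padded)   # a10b20
```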

Changed in version 2.7: Added the optional flags argument.

re.subn(pattern, repl, string, count=0, flags=0)

Perform the same operation as sub(), but return a tuple (new_string, number_of_subs_made).

Changed in version 2.7: Added the optional flags argument.

re.escape(pattern)

Escape all the characters in pattern except ASCII letters and numbers. This is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it. For example:

>>> print re.escape('python.exe')
python\.exe

>>> legal_chars = string.ascii_lowercase + string.digits + "!#$%&'*+-.^_`|~:"
>>> print '[%s]+' % re.escape(legal_chars)
[abcdefghijklmnopqrstuvwxyz0123456789\!\#\$\%\&\'\*\+\-\.\^\_\`\|\~\:]+

>>> operators = ['+', '-', '*', '/', '**']
>>> print '|'.join(map(re.escape, sorted(operators, reverse=True)))
\/|\-|\+|\*\*|\*

re.purge()

Clear the regular expression cache.

exception re.error

Exception raised when a string passed to one of the functions here is not a valid regular expression (for example, it might contain unmatched parentheses) or when some other error occurs during compilation or matching. It is never an error if a string contains no match for a pattern.
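A short sketch of the distinction between an invalid pattern and a pattern that simply finds nothing, assuming Python 3. The helper name is_valid_pattern is hypothetical and not part of the re module.

```python
import re

def is_valid_pattern(pattern):
    """Return True if the pattern compiles, False if re.error is raised."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

print(is_valid_pattern(r"(\d+)"))  # True
print(is_valid_pattern(r"(\d+"))   # False: unmatched parenthesis

# A valid pattern with no match is not an error; search() just returns None.
no_match = re.search(r"\d+", "no digits here")
print(no_match)  # None
```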

\number

Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of number is 0, or number is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value number. Inside the '[' and ']' of a character class, all numeric escapes are treated as characters.
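The examples in this paragraph can be checked with a small sketch (Python 3 assumed). The second pattern, which looks for doubled words, is an illustrative addition rather than something taken from the text.

```python
import re

repeated = re.compile(r"(.+) \1")

print(bool(repeated.search("the the")))  # True
print(bool(repeated.search("55 55")))    # True
print(bool(repeated.search("thethe")))   # False: no space after the group

# Illustrative use: find words that are immediately repeated.
doubled = re.findall(r"\b(\w+) \1\b", "it is is a a test")
print(doubled)  # ['is', 'a']
```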

\A

Matches only at the start of the string.

\b

Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of alphanumeric or underscore characters, so the end of a word is indicated by whitespace or a non-alphanumeric, non-underscore character. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string, so the precise set of characters deemed to be alphanumeric depends on the values of the UNICODE and LOCALE flags. For example, r'\bfoo\b' matches 'foo', 'foo.', '(foo)', 'bar foo baz' but not 'foobar' or 'foo3'. Inside a character range, \b represents the backspace character, for compatibility with Python’s string literals.

\B

Matches the empty string, but only when it is not at the beginning or end of a word. This means that r'py\B' matches 'python', 'py3', 'py2', but not 'py', 'py.', or 'py!'. \B is just the opposite of \b, so is also subject to the settings of LOCALE and UNICODE.

\d

When the UNICODE flag is not specified, matches any decimal digit; this is equivalent to the set [0-9]. With UNICODE, it will match whatever is classified as a decimal digit in the Unicode character properties database.

\D

When the UNICODE flag is not specified, matches any non-digit character; this is equivalent to the set [^0-9]. With UNICODE, it will match anything other than character marked as digits in the Unicode character properties database.

\s

When the UNICODE flag is not specified, it matches any whitespace character; this is equivalent to the set [ \t\n\r\f\v]. The LOCALE flag has no extra effect on matching of the space. If UNICODE is set, this will match the characters [ \t\n\r\f\v] plus whatever is classified as space in the Unicode character properties database.

\S

When the UNICODE flag is not specified, matches any non-whitespace character; this is equivalent to the set [^ \t\n\r\f\v]. The LOCALE flag has no extra effect on non-whitespace match. If UNICODE is set, then any character not marked as space in the Unicode character properties database is matched.

\w

When the LOCALE and UNICODE flags are not specified, matches any alphanumeric character and the underscore; this is equivalent to the set [a-zA-Z0-9_]. With LOCALE, it will match the set [0-9_] plus whatever characters are defined as alphanumeric for the current locale. If UNICODE is set, this will match the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database.

\W

When the LOCALE and UNICODE flags are not specified, matches any non-alphanumeric character; this is equivalent to the set [^a-zA-Z0-9_]. With LOCALE, it will match any character not in the set [0-9_], and not defined as alphanumeric for the current locale. If UNICODE is set, this will match anything other than [0-9_] plus characters classified as not alphanumeric in the Unicode character properties database.

\Z

Matches only at the end of the string.

'.'

(Dot.) In the default mode, this matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline.

'^'

(Caret.) Matches the start of the string, and in MULTILINE mode also matches immediately after each newline.

'$'

Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline. foo matches both ‘foo’ and ‘foobar’, while the regular expression foo$ matches only ‘foo’. More interestingly, searching for foo.$ in 'foo1\nfoo2\n' matches ‘foo2’ normally, but ‘foo1’ in MULTILINE mode; searching for a single $ in 'foo\n' will find two (empty) matches: one just before the newline, and one at the end of the string.

'*'

It causes the resulting RE to match 0 or more repetitions of the preceding RE, as many repetitions as are possible. ab* will match ‘a’, ‘ab’, or ‘a’ followed by any number of ‘b’s.

'+'

Causes the resulting RE to match 1 or more repetitions of the preceding RE. ab+ will match ‘a’ followed by any non-zero number of ‘b’s; it will not match just ‘a’.

'?'

Causes the resulting RE to match 0 or 1 repetitions of the preceding RE. ab? will match either ‘a’ or ‘ab’.

*?, +?, ??

The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn’t desired; if the RE <.*> is matched against <a> b <c>, it will match the entire string, and not just <a>. Adding ? after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using the RE <.*?> will match only <a>.
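The difference between the greedy and non-greedy forms can be seen directly in a short sketch (Python 3 assumed):

```python
import re

text = "<a> b <c>"

# Greedy: .* runs to the last '>' in the string.
greedy = re.match(r"<.*>", text).group(0)

# Non-greedy: .*? stops at the first '>' that lets the match succeed.
lazy = re.match(r"<.*?>", text).group(0)

print(greedy)  # <a> b <c>
print(lazy)    # <a>
```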

{m}

Specifies that exactly m copies of the previous RE should be matched; fewer matches cause the entire RE not to match. For example, a{6} will match exactly six 'a' characters, but not five.

{m,n}

Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as many repetitions as possible. For example, a{3,5} will match from 3 to 5 'a' characters. Omitting m specifies a lower bound of zero, and omitting n specifies an infinite upper bound. As an example, a{4,}b will match aaaab or a thousand 'a' characters followed by a b, but not aaab. The comma may not be omitted or the modifier would be confused with the previously described form.

{m,n}?

Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as few repetitions as possible. This is the non-greedy version of the previous qualifier. For example, on the 6-character string 'aaaaaa', a{3,5} will match 5 'a' characters, while a{3,5}? will only match 3 characters.

'\'

Either escapes special characters (permitting you to match characters like '*', '?', and so forth), or signals a special sequence; special sequences are discussed below.

If you’re not using a raw string to express the pattern, remember that Python also uses the backslash as an escape sequence in string literals; if the escape sequence isn’t recognized by Python’s parser, the backslash and subsequent character are included in the resulting string. However, if Python would recognize the resulting sequence, the backslash should be repeated twice. This is complicated and hard to understand, so it’s highly recommended that you use raw strings for all but the simplest expressions.
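A brief sketch of why raw strings are recommended, assuming Python 3. The pattern and sample sentence are invented for the illustration.

```python
import re

# In an ordinary string literal, "\b" is a single backspace character.
backspace_length = len("\b")

# These two spellings produce the same pattern text: \bcat\b
raw = r"\bcat\b"
doubled = "\\bcat\\b"
same_pattern = (raw == doubled)

# The raw pattern finds the word; the unescaped one searches for
# literal backspace characters around "cat" and finds nothing.
with_raw = re.search(raw, "a cat sat")
without_raw = re.search("\bcat\b", "a cat sat")

print(backspace_length)             # 1
print(same_pattern)                 # True
print(with_raw is not None)         # True
print(without_raw is None)          # True
```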

[]

Used to indicate a set of characters. In a set:

  • Characters can be listed individually, e.g. [amk] will match 'a', 'm', or 'k'.
  • Ranges of characters can be indicated by giving two characters and separating them by a '-', for example [a-z] will match any lowercase ASCII letter, [0-5][0-9] will match all the two-digits numbers from 00 to 59, and [0-9A-Fa-f] will match any hexadecimal digit. If - is escaped (e.g. [a\-z]) or if it’s placed as the first or last character (e.g. [a-]), it will match a literal '-'.
  • Special characters lose their special meaning inside sets. For example, [(+*)] will match any of the literal characters '(', '+', '*', or ')'.
  • Character classes such as \w or \S (defined below) are also accepted inside a set, although the characters they match depends on whether LOCALE or UNICODE mode is in force.
  • Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5', and [^^] will match any character except '^'. ^ has no special meaning if it’s not the first character in the set.
  • To match a literal ']' inside a set, precede it with a backslash, or place it at the beginning of the set. For example, [()[\]{}] and []()[{}] will both match a parenthesis.
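The set rules listed above can be exercised with a few short calls. This is a sketch under the assumption of Python 3, with sample strings chosen only for illustration.

```python
import re

# Characters listed individually.
listed = re.findall(r"[amk]", "makes")

# Ranges: two digits from 00 to 59.
in_range = re.fullmatch(r"[0-5][0-9]", "59") is not None
out_of_range = re.fullmatch(r"[0-5][0-9]", "60") is not None

# Complemented set: everything except '5'.
not_five = re.findall(r"[^5]", "1525")

# A literal ']' inside a set, escaped with a backslash.
brackets = re.findall(r"[()[\]{}]", "f(x)[0]{}")

print(listed)        # ['m', 'a', 'k']
print(in_range)      # True
print(out_of_range)  # False
print(not_five)      # ['1', '2']
print(brackets)      # ['(', ')', '[', ']', '{', '}']
```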

'|'

A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].
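That the first matching branch wins, even when a later branch would match more text, can be checked with a small sketch (Python 3 assumed):

```python
import re

# 'a' is tried first and succeeds, so 'ab' is never attempted.
first_branch = re.match(r"a|ab", "ab").group(0)

# Reordering the branches changes the result.
longer_first = re.match(r"ab|a", "ab").group(0)

# A literal '|' has to be escaped (or placed in a character class).
parts = re.split(r"\|", "x|y|z")

print(first_branch)  # a
print(longer_first)  # ab
print(parts)         # ['x', 'y', 'z']
```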

(...)

Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(] [)].

(?...)

This is an extension notation (a '?' following a '(' is not meaningful otherwise). The first character after the '?' determines what the meaning and further syntax of the construct is. Extensions usually do not create a new group; (?P<name>...) is the only exception to this rule. Following are the currently supported extensions.

(?iLmsux)

(One or more letters from the set 'i', 'L', 'm', 's', 'u', 'x'.) The group matches the empty string; the letters set the corresponding flags: re.I (ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), re.U (Unicode dependent), and re.X (verbose), for the entire regular expression. (The flags are described in Module Contents.) This is useful if you wish to include the flags as part of the regular expression, instead of passing a flag argument to the re.compile() function.

Note that the (?x) flag changes how the expression is parsed. It should be used first in the expression string, or after one or more whitespace characters. If there are non-whitespace characters before the flag, the results are undefined.

(?:...)

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.

(?P<name>...)

Similar to regular parentheses, but the substring matched by the group is accessible via the symbolic group name name. Group names must be valid Python identifiers, and each group name must be defined only once within a regular expression. A symbolic group is also a numbered group, just as if the group were not named.

Named groups can be referenced in three contexts. If the pattern is (?P<quote>['"]).*?(?P=quote) (i.e. matching a string quoted with either single or double quotes):

Context of reference to group “quote” and ways to reference it:

  • in the same pattern itself: (?P=quote) (as shown), \1
  • when processing match object m: m.group('quote'), m.end('quote') (etc.)
  • in a string passed to the repl argument of re.sub(): \g<quote>, \g<1>, \1
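The three contexts can be tried together in one sketch. It assumes Python 3 and uses the quote pattern from the text; the sample sentence and the replacement text are invented.

```python
import re

# Context 1: (?P=quote) refers back to the group inside the pattern itself.
pattern = re.compile(r"""(?P<quote>['"]).*?(?P=quote)""")

m = pattern.search('say "hello" now')

# Context 2: the match object exposes the group by name or by number.
whole = m.group(0)
by_name = m.group("quote")
by_number = m.group(1)
quote_end = m.end("quote")

# Context 3: \g<quote> in the replacement string passed to sub().
replaced = pattern.sub(r"\g<quote>X\g<quote>", 'say "hello" now')

# Mismatched quotes do not match, because the closing quote must be
# the same character as the opening one.
mismatched = pattern.search("'oops\"")

print(whole)       # "hello"
print(by_name)     # "
print(quote_end)   # 5
print(replaced)    # say "X" now
print(mismatched)  # None
```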

(?P=name)

A backreference to a named group; it matches whatever text was matched by the earlier group named name.

(?#...)

A comment; the contents of the parentheses are simply ignored.

(?=...)

Matches if ... matches next, but doesn’t consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it’s followed by 'Asimov'.

(?!...)

Matches if ... doesn’t match next. This is a negative lookahead assertion. For example, Isaac (?!Asimov) will match 'Isaac ' only if it’s not followed by 'Asimov'.

(?<=...)

Matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion. (?<=abc)def will find a match in abcdef, since the lookbehind will back up 3 characters and check if the contained pattern matches. The contained pattern must only match strings of some fixed length, meaning that abc or a|b are allowed, but a* and a{3,4} are not. Group references are not supported even if they match strings of some fixed length. Note that patterns which start with positive lookbehind assertions will not match at the beginning of the string being searched; you will most likely want to use the search() function rather than the match() function:

>>> import re
>>> m = re.search('(?<=abc)def', 'abcdef')
>>> m.group(0)
'def'

This example looks for a word following a hyphen:

>>> m = re.search('(?<=-)\w+', 'spam-egg')
>>> m.group(0)
'egg'

(?<!...)

Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length and shouldn’t contain group references. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.

(?(id/name)yes-pattern|no-pattern)

Will try to match with yes-pattern if the group with given id or name exists, and with no-pattern if it doesn’t. no-pattern is optional and can be omitted. For example, (<)?(\w+@\w+(?:\.\w+)+)(?(1)>) is a poor email matching pattern, which will match with '<user@host.com>' as well as 'user@host.com', but not with '<user@host.com'.
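The conditional construct can be checked with a short sketch. It assumes Python 3, assumes the intended pattern is (<)?(\w+@\w+(?:\.\w+)+)(?(1)>), and uses user@host.com purely as a placeholder address.

```python
import re

# Group 1 is the optional '<'. The conditional (?(1)>) requires a
# closing '>' only when group 1 took part in the match.
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)")

bracketed = pattern.match("<user@host.com>")
plain = pattern.match("user@host.com")
unclosed = pattern.match("<user@host.com")

print(bracketed.group(2))  # user@host.com
print(plain.group(0))      # user@host.com
print(unclosed)            # None
```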

Python Exception Handling

Until now error messages haven’t been more than mentioned, but if you have tried out the examples you have probably seen some. There are (at least) two distinguishable kinds of errors: syntax errors and exceptions.

1. Syntax Errors

Syntax errors, also known as parsing errors, are perhaps the most common kind of complaint you get while you are still learning Python:

>>> while True print('Hello world')
  File "<stdin>", line 1
    while True print('Hello world')
                   ^
SyntaxError: invalid syntax

The parser repeats the offending line and displays a little ‘arrow’ pointing at the earliest point in the line where the error was detected. The error is caused by (or at least detected at) the token preceding the arrow: in the example, the error is detected at the function print(), since a colon (':') is missing before it. File name and line number are printed so you know where to look in case the input came from a script.

2. Exceptions

Even if a statement or expression is syntactically correct, it may cause an error when an attempt is made to execute it. Errors detected during execution are called exceptions and are not unconditionally fatal: you will soon learn how to handle them in Python programs. Most exceptions are not handled by programs, however, and result in error messages as shown here:

>>> 10 * (1/0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>> 4 + spam*3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'spam' is not defined
>>> '2' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly

The last line of the error message indicates what happened. Exceptions come in different types, and the type is printed as part of the message: the types in the example are ZeroDivisionError, NameError and TypeError. The string printed as the exception type is the name of the built-in exception that occurred. This is true for all built-in exceptions, but need not be true for user-defined exceptions (although it is a useful convention). Standard exception names are built-in identifiers (not reserved keywords).

The rest of the line provides detail based on the type of exception and what caused it.

The preceding part of the error message shows the context where the exception happened, in the form of a stack traceback. In general it contains a stack traceback listing source lines; however, it will not display lines read from standard input.

Built-in Exceptions lists the built-in exceptions and their meanings.

3. Handling Exceptions

It is possible to write programs that handle selected exceptions. Look at the following example, which asks the user for input until a valid integer has been entered, but allows the user to interrupt the program (using Control-C or whatever the operating system supports); note that a user-generated interruption is signalled by raising the KeyboardInterrupt exception.

>>> while True:
...     try:
...         x = int(input("Please enter a number: "))
...         break
...     except ValueError:
...         print("Oops!  That was no valid number.  Try again...")
...

The try statement works as follows.

  • First, the try clause (the statement(s) between the try and except keywords) is executed.
  • If no exception occurs, the except clause is skipped and execution of the try statement is finished.
  • If an exception occurs during execution of the try clause, the rest of the clause is skipped. Then if its type matches the exception named after the except keyword, the except clause is executed, and then execution continues after the try statement.
  • If an exception occurs which does not match the exception named in the except clause, it is passed on to outer try statements; if no handler is found, it is an unhandled exception and execution stops with a message as shown above.
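The steps above can be sketched without interactive input. The function name parse_int and its default parameter are hypothetical, chosen only to show which exceptions the except clause catches and which pass through.

```python
def parse_int(text, default=None):
    """Return int(text), or default when text is not a valid integer."""
    try:
        value = int(text)      # the try clause
    except ValueError:         # runs only for a matching exception type
        return default
    return value               # reached when no exception occurred

print(parse_int("42"))    # 42
print(parse_int("spam"))  # None

# int(None) raises TypeError, which does not match ValueError,
# so it propagates to the caller instead of being handled here.
try:
    parse_int(None)
    propagated = False
except TypeError:
    propagated = True
print(propagated)         # True
```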

A try statement may have more than one except clause, to specify handlers for different exceptions. At most one handler will be executed. Handlers only handle exceptions that occur in the corresponding try clause, not in other handlers of the same try statement. An except clause may name multiple exceptions as a parenthesized tuple, for example:

... except (RuntimeError, TypeError, NameError):
...     pass

A class in an except clause is compatible with an exception if it is the same class or a base class thereof (but not the other way around — an except clause listing a derived class is not compatible with a base class). For example, the following code will print B, C, D in that order:

class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        print("D")
    except C:
        print("C")
    except B:
        print("B")

Note that if the except clauses were reversed (with except B first), it would have printed B, B, B — the first matching except clause is triggered.

The last except clause may omit the exception name(s), to serve as a wildcard. Use this with extreme caution, since it is easy to mask a real programming error in this way! It can also be used to print an error message and then re-raise the exception (allowing a caller to handle the exception as well):

import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except OSError as err:
    print("OS error: {0}".format(err))
except ValueError:
    print("Could not convert data to an integer.")
except:
    print("Unexpected error:", sys.exc_info()[0])
    raise

The try … except statement has an optional else clause, which, when present, must follow all except clauses. It is useful for code that must be executed if the try clause does not raise an exception. For example:

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except OSError:
        print('cannot open', arg)
    else:
        print(arg, 'has', len(f.readlines()), 'lines')
        f.close()

The use of the else clause is better than adding additional code to the try clause because it avoids accidentally catching an exception that wasn’t raised by the code being protected by the try … except statement.

When an exception occurs, it may have an associated value, also known as the exception’s argument. The presence and type of the argument depend on the exception type.

The except clause may specify a variable after the exception name. The variable is bound to an exception instance with the arguments stored in instance.args. For convenience, the exception instance defines __str__() so the arguments can be printed directly without having to reference .args. One may also instantiate an exception first before raising it and add any attributes to it as desired.

>>> try:
...     raise Exception('spam', 'eggs')
... except Exception as inst:
...     print(type(inst))    # the exception instance
...     print(inst.args)     # arguments stored in .args
...     print(inst)          # __str__ allows args to be printed directly,
...                          # but may be overridden in exception subclasses
...     x, y = inst.args     # unpack args
...     print('x =', x)
...     print('y =', y)
...
<class 'Exception'>
('spam', 'eggs')
('spam', 'eggs')
x = spam
y = eggs

If an exception has arguments, they are printed as the last part (‘detail’) of the message for unhandled exceptions.

Exception handlers don’t just handle exceptions if they occur immediately in the try clause, but also if they occur inside functions that are called (even indirectly) in the try clause. For example:

>>> def this_fails():
...     x = 1/0
...
>>> try:
...     this_fails()
... except ZeroDivisionError as err:
...     print('Handling run-time error:', err)
...
Handling run-time error: division by zero

4. Raising Exceptions

The raise statement allows the programmer to force a specified exception to occur. For example:

>>> raise NameError('HiThere')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: HiThere

The sole argument to raise indicates the exception to be raised. This must be either an exception instance or an exception class (a class that derives from Exception). If an exception class is passed, it will be implicitly instantiated by calling its constructor with no arguments:

raise ValueError  # shorthand for 'raise ValueError()'

If you need to determine whether an exception was raised but don’t intend to handle it, a simpler form of the raise statement allows you to re-raise the exception:

>>> try:
...     raise NameError('HiThere')
... except NameError:
...     print('An exception flew by!')
...     raise
...
An exception flew by!
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
NameError: HiThere

5. User-defined Exceptions

Programs may name their own exceptions by creating a new exception class (see Classes for more about Python classes). Exceptions should typically be derived from the Exception class, either directly or indirectly.

Exception classes can be defined which do anything any other class can do, but are usually kept simple, often only offering a number of attributes that allow information about the error to be extracted by handlers for the exception. When creating a module that can raise several distinct errors, a common practice is to create a base class for exceptions defined by that module, and subclass that to create specific exception classes for different error conditions:

class Error(Exception):
    """Base class for exceptions in this module."""
    pass

class InputError(Error):
    """Exception raised for errors in the input.

    Attributes:
        expression -- input expression in which the error occurred
        message -- explanation of the error
    """

    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

class TransitionError(Error):
    """Raised when an operation attempts a state transition that's not
    allowed.

    Attributes:
        previous -- state at beginning of transition
        next -- attempted new state
        message -- explanation of why the specific transition is not allowed
    """

    def __init__(self, previous, next, message):
        self.previous = previous
        self.next = next
        self.message = message

Most exceptions are defined with names that end in “Error”, similar to the naming of the standard exceptions.

Many standard modules define their own exceptions to report errors that may occur in functions they define. More information on classes is presented in chapter Classes.
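How a handler might use such a hierarchy is sketched below. The classes follow the pattern shown earlier in this section; the functions parse_positive and describe are hypothetical and exist only for this illustration.

```python
class Error(Exception):
    """Base class for exceptions in this module."""

class InputError(Error):
    """Raised for errors in the input."""

    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

def parse_positive(text):
    """Hypothetical helper: convert text to a positive integer."""
    try:
        value = int(text)
    except ValueError:
        raise InputError(text, "not an integer")
    if value <= 0:
        raise InputError(text, "must be positive")
    return value

def describe(text):
    # Catching the base class handles every error defined by the module,
    # and the attributes carry the details to the handler.
    try:
        return "ok: {}".format(parse_positive(text))
    except Error as err:
        return "bad input {!r}: {}".format(err.expression, err.message)

print(describe("12"))   # ok: 12
print(describe("-3"))   # bad input '-3': must be positive
print(describe("abc"))  # bad input 'abc': not an integer
```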

6. Defining Clean-up Actions

The try statement has another optional clause which is intended to define clean-up actions that must be executed under all circumstances. For example:

>>> try:
...     raise KeyboardInterrupt
... finally:
...     print('Goodbye, world!')
...
Goodbye, world!
KeyboardInterrupt
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>

If a finally clause is present, the finally clause will execute as the last task before the try statement completes. The finally clause runs whether or not the try statement produces an exception. The following points discuss more complex cases when an exception occurs:

  • If an exception occurs during execution of the try clause, the exception may be handled by an except clause. If the exception is not handled by an except clause, it is re-raised after the finally clause has been executed.
  • An exception could occur during execution of an except or else clause. Again, the exception is re-raised after the finally clause has been executed.
  • If the try statement reaches a break, continue or return statement, the finally clause will execute just prior to the break, continue or return statement’s execution.
  • If a finally clause includes a return statement, the finally clause’s return statement will execute before, and instead of, the return statement in a try clause.

For example:

>>> def bool_return():
...     try:
...         return True
...     finally:
...         return False
...
>>> bool_return()
False

A more complicated example:

>>> def divide(x, y):
...     try:
...         result = x / y
...     except ZeroDivisionError:
...         print("division by zero!")
...     else:
...         print("result is", result)
...     finally:
...         print("executing finally clause")
...
>>> divide(2, 1)
result is 2.0
executing finally clause
>>> divide(2, 0)
division by zero!
executing finally clause
>>> divide("2", "1")
executing finally clause
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in divide
TypeError: unsupported operand type(s) for /: 'str' and 'str'

As you can see, the finally clause is executed in any event. The TypeError raised by dividing two strings is not handled by the except clause and therefore re-raised after the finally clause has been executed.

In real world applications, the finally clause is useful for releasing external resources (such as files or network connections), regardless of whether the use of the resource was successful.
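A minimal sketch of that pattern follows. The Connection class is a hypothetical stand-in for a real external resource, and send_once is an invented helper.

```python
class Connection:
    """Hypothetical stand-in for a network connection."""

    def __init__(self):
        self.is_open = True

    def send(self, data):
        if not data:
            raise ValueError("nothing to send")
        return len(data)

    def close(self):
        self.is_open = False

def send_once(conn, data):
    try:
        return conn.send(data)
    finally:
        # Runs whether send() returned normally or raised.
        conn.close()

good = Connection()
sent = send_once(good, "hi")

bad = Connection()
try:
    send_once(bad, "")
except ValueError:
    pass

print(sent)          # 2
print(good.is_open)  # False
print(bad.is_open)   # False: closed even though send() failed
```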

7. Predefined Clean-up Actions

Some objects define standard clean-up actions to be undertaken when the object is no longer needed, regardless of whether or not the operation using the object succeeded or failed. Look at the following example, which tries to open a file and print its contents to the screen.

for line in open("myfile.txt"):
    print(line, end="")

The problem with this code is that it leaves the file open for an indeterminate amount of time after this part of the code has finished executing. This is not an issue in simple scripts, but can be a problem for larger applications. The with statement allows objects like files to be used in a way that ensures they are always cleaned up promptly and correctly.

with open("myfile.txt") as f:
    for line in f:
        print(line, end="")

After the statement is executed, the file f is always closed, even if a problem was encountered while processing the lines. Objects which, like files, provide predefined clean-up actions will indicate this in their documentation.
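That the file is closed even when processing fails can be verified with a short sketch. It writes a temporary file so that it is self-contained; the file contents and the RuntimeError are invented for the illustration.

```python
import os
import tempfile

# Create a small temporary file to read from.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("first\nsecond\n")

try:
    with open(path) as f:
        for line in f:
            # Simulate a problem while processing the lines.
            raise RuntimeError("stopped while reading")
except RuntimeError:
    pass

# The with statement closed the file on the way out.
closed_after_error = f.closed
print(closed_after_error)  # True

os.remove(path)
```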

The Python interpreter raises exceptions when it encounters errors, for example division by zero. In this article, you will learn about the different exceptions that are built into Python.

When writing a program, we, more often than not, will encounter errors.

An error caused by not following the proper structure (syntax) of the language is called a syntax error or parsing error.

>>> if a < 3
  File "<interactive input>", line 1
    if a < 3
           ^
SyntaxError: invalid syntax

We can notice here that a colon is missing in the if statement.

Errors can also occur at runtime, and these are called exceptions. They occur, for example, when a file we try to open does not exist (FileNotFoundError), when a number is divided by zero (ZeroDivisionError), or when a module we try to import is not found (ImportError).

Whenever these types of runtime errors occur, Python creates an exception object. If not handled properly, it prints a traceback to that error along with some details about why that error occurred.

>>> 1 / 0
Traceback (most recent call last):
 File "<string>", line 301, in runcode
 File "<interactive input>", line 1, in <module>
ZeroDivisionError: division by zero
>>> open("imaginary.txt")
Traceback (most recent call last):
 File "<string>", line 301, in runcode
 File "<interactive input>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'imaginary.txt'
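The exception object that Python creates can also be caught and inspected instead of being left to print a traceback. A small sketch, with the hypothetical helper describe_failure and a temporary directory used so the missing file is guaranteed not to exist:

```python
import os
import tempfile

def describe_failure(action):
    """Run action() and return (exception type name, message), or None."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None

zero_error = describe_failure(lambda: 1 / 0)

with tempfile.TemporaryDirectory() as folder:
    missing_path = os.path.join(folder, "imaginary.txt")
    file_error = describe_failure(lambda: open(missing_path))

no_error = describe_failure(lambda: 1 + 1)

print(zero_error[0])  # ZeroDivisionError
print(file_error[0])  # FileNotFoundError
print(no_error)       # None
```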

Python Built-in Exceptions

Illegal operations can raise exceptions. Python has plenty of built-in exceptions that are raised when the corresponding errors occur. In an interactive session, we can view them using the locals() built-in function as follows.

>>> locals()['__builtins__']

In the main module this returns the builtins module, which holds the built-in exceptions, functions and attributes; calling dir() on it lists their names.
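To list only the exception names, one option is to inspect the builtins module directly. This is a minimal sketch rather than the only approach; the variable name exception_names is made up for illustration.

```python
import builtins

# Collect every name in the builtins module that is bound to an exception class.
exception_names = sorted(
    name
    for name, obj in vars(builtins).items()
    if isinstance(obj, type) and issubclass(obj, BaseException)
)

print(len(exception_names))
print("ZeroDivisionError" in exception_names)
```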

Some of the common built-in exceptions in Python programming, along with the errors that cause them, are tabulated below.

Exception | Cause of Error
AssertionError | Raised when an assert statement fails.
AttributeError | Raised when attribute assignment or reference fails.
EOFError | Raised when the input() function hits an end-of-file condition.
FloatingPointError | Raised when a floating point operation fails.
GeneratorExit | Raised when a generator’s close() method is called.
ImportError | Raised when the imported module is not found.
IndexError | Raised when the index of a sequence is out of range.
KeyError | Raised when a key is not found in a dictionary.
KeyboardInterrupt | Raised when the user hits the interrupt key (Ctrl+C or Delete).
MemoryError | Raised when an operation runs out of memory.
NameError | Raised when a variable is not found in the local or global scope.
NotImplementedError | Raised by abstract methods.
OSError | Raised when a system operation causes a system-related error.
OverflowError | Raised when the result of an arithmetic operation is too large to be represented.
ReferenceError | Raised when a weak reference proxy is used to access a garbage collected referent.
RuntimeError | Raised when an error does not fall under any other category.
StopIteration | Raised by the next() function to indicate that there is no further item to be returned by the iterator.
SyntaxError | Raised by the parser when a syntax error is encountered.
IndentationError | Raised when there is incorrect indentation.
TabError | Raised when indentation consists of inconsistent tabs and spaces.
SystemError | Raised when the interpreter detects an internal error.
SystemExit | Raised by the sys.exit() function.
TypeError | Raised when a function or operation is applied to an object of incorrect type.
UnboundLocalError | Raised when a reference is made to a local variable in a function or method, but no value has been bound to that variable.
UnicodeError | Raised when a Unicode-related encoding or decoding error occurs.
UnicodeEncodeError | Raised when a Unicode-related error occurs during encoding.
UnicodeDecodeError | Raised when a Unicode-related error occurs during decoding.
UnicodeTranslateError | Raised when a Unicode-related error occurs during translating.
ValueError | Raised when a function gets an argument of the correct type but an improper value.
ZeroDivisionError | Raised when the second operand of a division or modulo operation is zero.

We can also define our own exceptions in Python (if required).
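A user-defined exception is usually a class derived from Exception. The sketch below is only an illustration: the class InsufficientFundsError and the function withdraw are hypothetical names, not part of Python.

```python
class InsufficientFundsError(Exception):
    """Hypothetical exception for an illustrative bank-account example."""

    def __init__(self, balance, amount):
        super().__init__(f"cannot withdraw {amount}; balance is only {balance}")
        self.balance = balance
        self.amount = amount


def withdraw(balance, amount):
    # Raise our own exception type when the request cannot be satisfied.
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount


try:
    withdraw(100, 250)
except InsufficientFundsError as exc:
    message = str(exc)
    print(message)
```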

We can handle these built-in and user-defined exceptions in Python using try, except and finally statements. 
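As a minimal sketch of how try, except and finally fit together, the hypothetical helper safe_divide below handles a ZeroDivisionError and records that the finally clause runs in both the success and the failure case:

```python
def safe_divide(a, b):
    """Return (result, log); result is None when b is zero."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError as exc:
        # Runs only when the division fails.
        log.append(f"handled: {exc}")
        result = None
    finally:
        # Runs whether or not an exception occurred.
        log.append("finally always runs")
    return result, log


print(safe_divide(6, 3))
print(safe_divide(1, 0))
```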

File Handling In Python

A file name has two parts, the name itself and its extension, separated by a dot (.) or period. A file also has two key properties: its name and its path, which specifies the location where the file exists.

Figure – File and its path (directory structure)

The built-in open() function creates a Python file object, which provides a connection to a file residing on the programmer’s machine. After calling open(), programmers can transfer strings of data to and from that external file.

File Opening In Python

The open() function is used to open a file in Python. It mainly takes two arguments: first the file name and then the file opening mode.
Syntax:

file_object = open(filename [,mode] [,buffering])

In the above syntax the parameters used are:

  • filename: It is the name of the file.
  • mode: It tells the program in which mode the file has to be opened.
  • buffering: If the value is set to zero (0), no buffering occurs while accessing the file (in Python 3 this is allowed only in binary mode); if the value is set to one (1), line buffering is performed while accessing the file.

Modes Of Opening File In Python

The file can be opened in the following modes:

Mode | Description
r | Opens a file for reading only. (It’s the default mode.)
w | Opens a file for writing. (If the file doesn’t exist already, it creates a new file. Otherwise, it truncates the file.)
x | Opens a file for exclusive creation. (The operation fails if the file already exists in the location.)
a | Opens a file for appending at the end of the file without truncating it. (Creates a new file if it does not exist in the location.)
t | Opens a file in text mode. (It’s the default mode.)
b | Opens a file in binary mode.
+ | Opens a file for updating (reading and writing).
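The modes can be combined, as the following sketch shows. It assumes a throwaway file in a temporary directory; the file name modes_demo.txt is made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file name

with open(path, "w") as f:      # creates the file (or truncates an existing one)
    f.write("one\n")

with open(path, "a") as f:      # appends without truncating
    f.write("two\n")

with open(path, "r") as f:      # read-only text mode, the default
    content = f.read()

with open(path, "r+") as f:     # update mode: read and write the same file
    f.seek(0)
    f.write("ONE")              # overwrites the first three characters
    f.seek(0)
    updated = f.read()

with open(path, "rb") as f:     # binary mode returns bytes instead of str
    raw = f.read()

print(repr(content))
print(repr(updated))
print(type(raw).__name__)
```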

File Object Attributes

If an attempt to open a file fails, open() raises an exception (OSError or one of its subclasses, such as FileNotFoundError); otherwise it returns a file object that provides various information related to that file.
Example:

# file opening example in Python
fo = open("sample.txt", "wb")
print("File Name:", fo.name)
print("Mode of Opening:", fo.mode)
print("Is Closed:", fo.closed)
# The softspace attribute existed only in Python 2, so it is not printed here.
fo.close()

Output:

File Name: sample.txt
Mode of Opening: wb
Is Closed: False

File Reading In Python

Text data is read and written using a text-encoding scheme such as ASCII (American Standard Code for Information Interchange), UTF-8 (Unicode Transformation Format, 8-bit) or UTF-16.

Once a file is opened using the open() function, it can be read with the read() method.
Example:

# Read the entire file as one string
with open('filename.txt') as f:
    data = f.read()

# Iterate over the lines of the file
with open('filename.txt') as f:
    for line in f:
        # process the line
        print(line, end='')

File Writing In Python

Similarly, to write data to a file, we open it with open() in ‘wt‘ mode, which clears and overwrites the previous content, and then use the write() method to write into the file.
Example:

# Write text data to a file
with open('filename.txt', 'wt') as f:
    f.write('hi there, this is a first line of file.\n')
    f.write('and another line.\n')

Output (contents of filename.txt):

hi there, this is a first line of file.
and another line.

By default, Python reads and writes files using the system default text encoding. Python understands several hundred text encodings, but the most commonly used are ASCII, Latin-1, UTF-8 and UTF-16. The ‘with’ statement in the example establishes a context in which the file is used. As control leaves the ‘with’ block, the file is closed automatically.
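To avoid depending on the system default, the encoding can be passed to open() explicitly. The following sketch assumes a throwaway file (encoded.txt is an illustrative name) and shows that reading the same bytes with a different encoding gives different text:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoded.txt")  # hypothetical file name
text = "café, naïve, 日本語\n"

# Pass the encoding explicitly instead of relying on the system default.
with open(path, "wt", encoding="utf-8") as f:
    f.write(text)

with open(path, "rt", encoding="utf-8") as f:
    round_trip = f.read()

# The same bytes decoded as Latin-1 give different (garbled) characters.
with open(path, "rt", encoding="latin-1") as f:
    misread = f.read()

print(round_trip == text)
print(misread == text)
print(f.closed)   # the with block closed the file automatically
```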

Writing A File That Does Not Exist

To write data to a file only if it does not already exist, open the file in ‘x‘ mode instead of ‘w‘ mode.

Let’s see two examples to differentiate between them.
Example:

# 'wt' silently overwrites filename.txt if it already exists.
with open('filename.txt', 'wt') as f:
    f.write('Hello, This is sample content.\n')

# 'xt' raises FileExistsError if filename.txt already exists.
with open('filename.txt', 'xt') as f:
    f.write('Hello, This is sample content.\n')

In binary mode, we should use ‘xb‘ instead of ‘xt‘.
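Exclusive creation reports an existing file through FileExistsError, which can be caught like any other exception. The helper write_once and the file name once.bin below are hypothetical and serve only as a sketch:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "once.bin")  # hypothetical file name


def write_once(path, payload):
    """Write payload only if path does not exist yet; report what happened."""
    try:
        with open(path, "xb") as f:
            f.write(payload)
        return "created"
    except FileExistsError:
        return "already exists"


first = write_once(path, b"\x00\x01")
second = write_once(path, b"\xff")   # refused, so the first payload survives
print(first, second)
```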

Closing A File In Python

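Python closes any files that are still open when the program exits, so forgetting to close a file is rarely fatal in a short script. It is still good practice to close each file as soon as you are done with it, using the close() method, so that buffered data is written out and system resources are released.
Syntax:

file_object.close()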

Example:

# Open the file before the try block so that fo is always defined in finally
fo = open("sample.txt", "wb")
try:
   # perform file operations
   fo.write(b"sample data")
finally:
   # Close the opened file
   fo.close()

Python OOP Concepts

  1. A Word About Names and Objects
  2. Python Scopes and Namespaces
    • 2.1. Scopes and Namespaces Example
  3. A First Look at Classes
    • 3.1. Class Definition Syntax
    • 3.2. Class Objects
    • 3.3. Instance Objects
    • 3.4. Method Objects
    • 3.5. Class and Instance Variables
  4. Random Remarks
  5. Inheritance
    • 5.1. Multiple Inheritance
  6. Private Variables
  7. Odds and Ends
  8. Iterators
  9. Generators
  10. Generator Expressions

Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by its class) for modifying its state.

Compared with other programming languages, Python’s class mechanism adds classes with a minimum of new syntax and semantics. It is a mixture of the class mechanisms found in C++ and Modula-3. Python classes provide all the standard features of Object-Oriented Programming: the class inheritance mechanism allows multiple base classes, a derived class can override any methods of its base class or classes, and a method can call the method of a base class with the same name. Objects can contain arbitrary amounts and kinds of data. As is true for modules, classes partake of the dynamic nature of Python: they are created at runtime and can be modified further after creation.

In C++ terminology, normally class members (including the data members) are public, and all member functions are virtual. As in Modula-3, there are no shorthands for referencing the object’s members from its methods: the method function is declared with an explicit first argument representing the object, which is provided implicitly by the call. As in Smalltalk, classes themselves are objects. This provides semantics for importing and renaming. Unlike C++ and Modula-3, built-in types can be used as base classes for extension by the user. Also, like in C++, most built-in operators with special syntax (arithmetic operators, subscripting, etc.) can be redefined for class instances.

(Lacking universally accepted terminology to talk about classes, I will make occasional use of Smalltalk and C++ terms. I would use Modula-3 terms, since its object-oriented semantics are closer to those of Python than C++, but I expect that few readers have heard of it.)

1. A Word About Names and Objects

Objects have individuality, and multiple names (in multiple scopes) can be bound to the same object. This is known as aliasing in other languages. This is usually not appreciated on a first glance at Python and can be safely ignored when dealing with immutable basic types (numbers, strings, tuples). However, aliasing has a possibly surprising effect on the semantics of Python code involving mutable objects such as lists, dictionaries, and most other types. This is usually used to the benefit of the program, since aliases behave like pointers in some respects. For example, passing an object is cheap since only a pointer is passed by the implementation; and if a function modifies an object passed as an argument, the caller will see the change — this eliminates the need for two different argument passing mechanisms as in Pascal.
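A short sketch of aliasing follows; the function add_item and the variable names are made up for illustration. It contrasts a mutable list, where a change is visible through every name, with an immutable integer:

```python
def add_item(items, item):
    # Mutates the list object that the caller passed in.
    items.append(item)


original = [1, 2]
alias = original          # a second name for the same list object
add_item(alias, 3)

print(original)           # the change is visible through both names
print(alias is original)

number = 10
other = number
other += 1                # rebinds other; the immutable int 10 is untouched
print(number, other)
```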

2. Python Scopes and Namespaces

Before introducing classes, I first have to tell you something about Python’s scope rules. Class definitions play some neat tricks with namespaces, and you need to know how scopes and namespaces work to fully understand what’s going on. Incidentally, knowledge about this subject is useful for any advanced Python programmer.

Let’s begin with some definitions.

A namespace is a mapping from names to objects. Most namespaces are currently implemented as Python dictionaries, but that’s normally not noticeable in any way (except for performance), and it may change in the future. Examples of namespaces are the set of built-in names (containing functions such as abs(), and built-in exception names); the global names in a module; and the local names in a function invocation. In a sense, the set of attributes of an object also forms a namespace. The important thing to know about namespaces is that there is absolutely no relation between names in different namespaces; for instance, two different modules may both define a function maximize without confusion; users of the modules must prefix it with the module name.

By the way, I use the word attribute for any name following a dot; for example, in the expression z.real, real is an attribute of the object z. Strictly speaking, references to names in modules are attribute references: in the expression modname.funcname, modname is a module object and funcname is an attribute of it. In this case, there happens to be a straightforward mapping between the module’s attributes and the global names defined in the module: they share the same namespace!

Attributes may be read-only or writable. In the latter case, assignment to attributes is possible. Module attributes are writable: you can write modname.the_answer = 42. Writable attributes may also be deleted with the del statement. For example, del modname.the_answer will remove the attribute the_answer from the object named by modname.
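Writable module attributes can be tried out without a separate file by creating a module object on the fly. In this sketch, modname is a stand-in built with types.ModuleType, not a real importable module:

```python
import types

# A stand-in module object created on the fly; "modname" is a hypothetical name.
modname = types.ModuleType("modname")

modname.the_answer = 42
value_after_assignment = modname.the_answer
in_namespace = "the_answer" in vars(modname)   # attributes live in the module's namespace

del modname.the_answer
exists_after_del = hasattr(modname, "the_answer")

print(value_after_assignment, in_namespace, exists_after_del)
```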

Namespaces are created at different moments and have different lifetimes. The namespace containing the built-in names is created when the Python interpreter starts up, and is never deleted. The global namespace for a module is created when the module definition is read in; normally, module namespaces also last until the interpreter quits. The statements executed by the top-level invocation of the interpreter, either read from a script file or interactively, are considered part of a module called __main__, so they have their own global namespace. (The built-in names actually also live in a module; this is called builtins.)

The local namespace for a function is created when the function is called and deleted when the function returns or raises an exception that is not handled within the function. (Actually, forgetting would be a better way to describe what actually happens.) Of course, recursive invocations each have their own local namespace.

A scope is a textual region of a Python program where a namespace is directly accessible. “Directly accessible” here means that an unqualified reference to a name attempts to find the name in the namespace.

Although scopes are determined statically, they are used dynamically. At any time during execution, there are at least three nested scopes whose namespaces are directly accessible:

  • the innermost scope, which is searched first, contains the local names
  • the scopes of any enclosing functions, which are searched starting with the nearest enclosing scope, contain non-local, but also non-global names
  • the next-to-last scope contains the current module’s global names
  • the outermost scope (searched last) is the namespace containing built-in names

If a name is declared global, then all references and assignments go directly to the next-to-last scope containing the module’s global names. To rebind variables found outside of the innermost scope, the nonlocal statement can be used; if not declared nonlocal, those variables are read-only (an attempt to write to such a variable will simply create a new local variable in the innermost scope, leaving the identically named outer variable unchanged).

Usually, the local scope references the local names of the (textually) current function. Outside functions, the local scope references the same namespace as the global scope: the module’s namespace. Class definitions place yet another namespace in the local scope.

It is important to realize that scopes are determined textually: the global scope of a function defined in a module is that module’s namespace, no matter from where or by what alias the function is called. On the other hand, the actual search for names is done dynamically, at run time — however, the language definition is evolving towards static name resolution, at “compile” time, so don’t rely on dynamic name resolution! (In fact, local variables are already determined statically.)

A special quirk of Python is that – if no global statement is in effect – assignments to names always go into the innermost scope. Assignments do not copy data — they just bind names to objects. The same is true for deletions: the statement del x removes the binding of x from the namespace referenced by the local scope. In fact, all operations that introduce new names use the local scope: in particular, import statements and function definitions bind the module or function name in the local scope.
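The following sketch illustrates these binding rules inside a function; demo and its variable names are hypothetical. The import, the assignment and the del statement all act on the function's local namespace:

```python
def demo():
    import math                 # binds the name math in demo's local scope
    values = [1, 2, 3]
    same = values               # binds another name; no data is copied
    same.append(4)
    local_names = sorted(locals())
    del same                    # removes the binding, not the list object
    return values, local_names, "same" in locals()


values, local_names, still_bound = demo()
print(values)
print(local_names)
print(still_bound)
```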

The global statement can be used to indicate that particular variables live in the global scope and should be rebound there; the nonlocal statement indicates that particular variables live in an enclosing scope and should be rebound there.

2.1. Scopes and Namespaces Example

This is an example demonstrating how to reference the different scopes and namespaces, and how global and nonlocal affect variable binding:

def scope_test():
    def do_local():
        spam = "local spam"

    def do_nonlocal():
        nonlocal spam
        spam = "nonlocal spam"

    def do_global():
        global spam
        spam = "global spam"

    spam = "test spam"
    do_local()
    print("After local assignment:", spam)
    do_nonlocal()
    print("After nonlocal assignment:", spam)
    do_global()
    print("After global assignment:", spam)

scope_test()
print("In global scope:", spam)

The output of the example code is:

After local assignment: test spam
After nonlocal assignment: nonlocal spam
After global assignment: nonlocal spam
In global scope: global spam

Note how the local assignment (which is default) didn’t change scope_test’s binding of spam. The nonlocal assignment changed scope_test’s binding of spam, and the global assignment changed the module-level binding.

You can also see that there was no previous binding for spam before the global assignment.

3. A First Look at Classes

Classes introduce a little bit of new syntax, three new object types, and some new semantics.

3.1. Class Definition Syntax

The simplest form of class definition looks like this:

class ClassName:
    <statement-1>
    .
    .
    .
    <statement-N>

Class definitions, like function definitions (def statements) must be executed before they have any effect. (You could conceivably place a class definition in a branch of an if statement, or inside a function.)

In practice, the statements inside a class definition will usually be function definitions, but other statements are allowed, and sometimes useful — we’ll come back to this later. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling conventions for methods — again, this is explained later.

When a class definition is entered, a new namespace is created, and used as the local scope — thus, all assignments to local variables go into this new namespace. In particular, function definitions bind the name of the new function here.

When a class definition is left normally (via the end), a class object is created. This is basically a wrapper around the contents of the namespace created by the class definition; we’ll learn more about class objects in the next section. The original local scope (the one in effect just before the class definition was entered) is reinstated, and the class object is bound here to the class name given in the class definition header (ClassName in the example).
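Because a class definition is an ordinary executable statement, it can sit inside a function or an if branch, and the names bound in its body end up in the class namespace. A minimal sketch, with make_class and Greeter as made-up names:

```python
def make_class(verbose):
    # The class statement that actually executes decides what Greeter is.
    if verbose:
        class Greeter:
            greeting = "hello, world"       # bound in the class namespace

            def greet(self):
                return self.greeting
    else:
        class Greeter:
            greeting = "hi"

            def greet(self):
                return self.greeting
    return Greeter


Greeter = make_class(verbose=True)
public_names = sorted(name for name in vars(Greeter) if not name.startswith("__"))
print(public_names)
print(Greeter().greet())
```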

3.2. Class Objects

Class objects support two kinds of operations: attribute references and instantiation.

Attribute references use the standard syntax used for all attribute references in Python: obj.name. Valid attribute names are all the names that were in the class’s namespace when the class object was created. So, if the class definition looked like this:

class MyClass:
    """A simple example class"""
    i = 12345

    def f(self):
        return 'hello world'

then MyClass.i and MyClass.f are valid attribute references, returning an integer and a function object, respectively. Class attributes can also be assigned to, so you can change the value of MyClass.i by assignment. __doc__ is also a valid attribute, returning the docstring belonging to the class: "A simple example class".
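These attribute references can be tried directly. The sketch repeats the MyClass definition so that it runs on its own:

```python
class MyClass:
    """A simple example class"""
    i = 12345

    def f(self):
        return 'hello world'


original_i = MyClass.i
doc = MyClass.__doc__
MyClass.i = 54321                        # class attributes can be rebound by assignment
kind_of_f = type(MyClass.f).__name__     # MyClass.f is a plain function object

print(original_i, MyClass.i)
print(doc)
print(kind_of_f)
```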

Class instantiation uses function notation. Just pretend that the class object is a parameterless function that returns a new instance of the class. For example (assuming the above class):

x = MyClass()

creates a new instance of the class and assigns this object to the local variable x.

The instantiation operation (“calling” a class object) creates an empty object. Many classes like to create objects with instances customized to a specific initial state. Therefore a class may define a special method named __init__(), like this:

def __init__(self):
    self.data = []

When a class defines an __init__() method, class instantiation automatically invokes __init__() for the newly-created class instance. So in this example, a new, initialized instance can be obtained by:

x = MyClass()
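Put together with the class body, a runnable sketch looks like this. The second instance y is added here only to show that each instance gets its own data list:

```python
class MyClass:
    """A simple example class"""
    i = 12345

    def __init__(self):
        self.data = []       # every new instance starts with its own empty list

    def f(self):
        return 'hello world'


x = MyClass()                # __init__() is invoked automatically
y = MyClass()
x.data.append(1)
print(x.data, y.data)
```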

Of course, the __init__() method may have arguments for greater flexibility. In that case, arguments given to the class instantiation operator are passed on to __init__(). For example,

>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
...
>>> x = Complex(3.0, -4.5)
>>> x.r, x.i
(3.0, -4.5)

3.3. Instance Objects

Now what can we do with instance objects? The only operations understood by instance objects are attribute references. There are two kinds of valid attribute names, data attributes and methods.

Data attributes correspond to “instance variables” in Smalltalk, and to “data members” in C++. Data attributes need not be declared; like local variables, they spring into existence when they are first assigned to. For example, if x is the instance of MyClass created above, the following piece of code will print the value 16, without leaving a trace:

x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print(x.counter)
del x.counter

The other kind of instance attribute reference is a method. A method is a function that “belongs to” an object. (In Python, the term method is not unique to class instances: other object types can have methods as well. For example, list objects have methods called append, insert, remove, sort, and so on. However, in the following discussion, we’ll use the term method exclusively to mean methods of class instance objects, unless explicitly stated otherwise.)

Valid method names of an instance object depend on its class. By definition, all attributes of a class that are function objects define corresponding methods of its instances. So in our example, x.f is a valid method reference, since MyClass.f is a function, but x.i is not, since MyClass.i is not. But x.f is not the same thing as MyClass.f — it is a method object, not a function object.

3.4. Method Objects

Usually, a method is called right after it is bound:

x.f()

In the MyClass example, this will return the string 'hello world'. However, it is not necessary to call a method right away: x.f is a method object, and can be stored away and called at a later time. For example:

xf = x.f
while True:
    print(xf())

will continue to print hello world until the end of time.

What exactly happens when a method is called? You may have noticed that x.f() was called without an argument above, even though the function definition for f() specified an argument. What happened to the argument? Surely Python raises an exception when a function that requires an argument is called without any — even if the argument isn’t actually used…

Actually, you may have guessed the answer: the special thing about methods is that the instance object is passed as the first argument of the function. In our example, the call x.f() is exactly equivalent to MyClass.f(x). In general, calling a method with a list of n arguments is equivalent to calling the corresponding function with an argument list that is created by inserting the method’s instance object before the first argument.

If you still don’t understand how methods work, a look at the implementation can perhaps clarify matters. When a non-data attribute of an instance is referenced, the instance’s class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.
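The packing described above can be observed through the __self__ and __func__ attributes of a bound method. A small sketch that reuses the MyClass example:

```python
class MyClass:
    def f(self):
        return 'hello world'


x = MyClass()
xf = x.f                                   # a bound method object

function_kind = type(MyClass.f).__name__   # plain function on the class
method_kind = type(xf).__name__            # method object on the instance
packs_instance = xf.__self__ is x          # the instance packed into the method
packs_function = xf.__func__ is MyClass.f  # the function packed into the method
same_result = xf() == MyClass.f(x)         # both calls are equivalent

print(function_kind, method_kind)
print(packs_instance, packs_function, same_result)
```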

3.5. Class and Instance Variables

Generally speaking, instance variables are for data unique to each instance and class variables are for attributes and methods shared by all instances of the class:

class Dog:

    kind = 'canine'         # class variable shared by all instances

    def __init__(self, name):
        self.name = name    # instance variable unique to each instance

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.kind                  # shared by all dogs
'canine'
>>> e.kind                  # shared by all dogs
'canine'
>>> d.name                  # unique to d
'Fido'
>>> e.name                  # unique to e
'Buddy'

As discussed in A Word About Names and Objects, shared data can have possibly surprising effects when mutable objects such as lists and dictionaries are involved. For example, the tricks list in the following code should not be used as a class variable because just a single list would be shared by all Dog instances:

class Dog:

    tricks = []             # mistaken use of a class variable

    def __init__(self, name):
        self.name = name

    def add_trick(self, trick):
        self.tricks.append(trick)

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks                # unexpectedly shared by all dogs
['roll over', 'play dead']

Correct design of the class should use an instance variable instead:

class Dog:

    def __init__(self, name):
        self.name = name
        self.tricks = []    # creates a new empty list for each dog

    def add_trick(self, trick):
        self.tricks.append(trick)

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks
['roll over']
>>> e.tricks
['play dead']

4. Random Remarks

Data attributes override method attributes with the same name; to avoid accidental name conflicts, which may cause hard-to-find bugs in large programs, it is wise to use some kind of convention that minimizes the chance of conflicts. Possible conventions include capitalizing method names, prefixing data attribute names with a small unique string (perhaps just an underscore), or using verbs for methods and nouns for data attributes.
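The name clash is easy to reproduce. In this sketch, Counter is a hypothetical class whose method count gets shadowed by a data attribute of the same name on one instance:

```python
class Counter:
    def count(self):
        return 0


c = Counter()
was_callable = callable(c.count)
c.count = 5                  # instance data attribute now shadows the method
shadowed = c.count
del c.count                  # removing the data attribute exposes the method again
restored = c.count()

print(was_callable, shadowed, restored)
```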

Data attributes may be referenced by methods as well as by ordinary users (“clients”) of an object. In other words, classes are not usable to implement pure abstract data types. In fact, nothing in Python makes it possible to enforce data hiding — it is all based upon convention. (On the other hand, the Python implementation, written in C, can completely hide implementation details and control access to an object if necessary; this can be used by extensions to Python written in C.)

Clients should use data attributes with care — clients may mess up invariants maintained by the methods by stamping on their data attributes. Note that clients may add data attributes of their own to an instance object without affecting the validity of the methods, as long as name conflicts are avoided — again, a naming convention can save a lot of headaches here.

There is no shorthand for referencing data attributes (or other methods!) from within methods. I find that this actually increases the readability of methods: there is no chance of confusing local variables and instance variables when glancing through a method.

Often, the first argument of a method is called self. This is nothing more than a convention: the name self has absolutely no special meaning to Python. Note, however, that by not following the convention your code may be less readable to other Python programmers, and it is also conceivable that a class browser program might be written that relies upon such a convention.
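To illustrate that the name is only a convention, the sketch below uses other names for the first argument. The class Point is made up, and this style is shown only to make the point, not as a recommendation:

```python
class Point:
    def __init__(this, x, y):      # "this" works, but surprises Python readers
        this.x = x
        this.y = y

    def norm1(me):                 # any name can receive the instance
        return abs(me.x) + abs(me.y)


p = Point(3, -4)
print(p.norm1())
```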

Any function object that is a class attribute defines a method for instances of that class. It is not necessary that the function definition is textually enclosed in the class definition: assigning a function object to a local variable in the class is also ok. For example:

# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1

    def g(self):
        return 'hello world'

    h = g

Now f, g and h are all attributes of class C that refer to function objects, and consequently they are all methods of instances of C, with h being exactly equivalent to g. Note that this practice usually only serves to confuse the reader of a program.

Methods may call other methods by using method attributes of the self argument:

class Bag:
    def __init__(self):
        self.data = []

    def add(self, x):
        self.data.append(x)

    def addtwice(self, x):
        self.add(x)
        self.add(x)

Methods may reference global names in the same way as ordinary functions. The global scope associated with a method is the module containing its definition. (A class is never used as a global scope.) While one rarely encounters a good reason for using global data in a method, there are many legitimate uses of the global scope: for one thing, functions and modules imported into the global scope can be used by methods, as well as functions and classes defined in it. Usually, the class containing the method is itself defined in this global scope, and in the next section we’ll find some good reasons why a method would want to reference its own class.

Each value is an object, and therefore has a class (also called its type). It is stored as object.__class__.
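A quick way to see this is to inspect __class__ on a few unrelated values; the sample values below are arbitrary:

```python
samples = (42, "text", [1, 2], None, len)

# __class__ and type() report the same class for every value.
class_names = [value.__class__.__name__ for value in samples]
all_match = all(value.__class__ is type(value) for value in samples)

print(class_names)
print(all_match)
```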

5. Inheritance

Of course, a language feature would not be worthy of the name “class” without supporting inheritance. The syntax for a derived class definition looks like this:

class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>

The name BaseClassName must be defined in a scope containing the derived class definition. In place of a base class name, other arbitrary expressions are also allowed. This can be useful, for example, when the base class is defined in another module:

class DerivedClassName(modname.BaseClassName):

Execution of a derived class definition proceeds the same as for a base class. When the class object is constructed, the base class is remembered. This is used for resolving attribute references: if a requested attribute is not found in the class, the search proceeds to look in the base class. This rule is applied recursively if the base class itself is derived from some other class.

There’s nothing special about instantiation of derived classes: DerivedClassName() creates a new instance of the class. Method references are resolved as follows: the corresponding class attribute is searched, descending down the chain of base classes if necessary, and the method reference is valid if this yields a function object.

Derived classes may override methods of their base classes. Because methods have no special privileges when calling other methods of the same object, a method of a base class that calls another method defined in the same base class may end up calling a method of a derived class that overrides it. (For C++ programmers: all methods in Python are effectively virtual.)

An overriding method in a derived class may in fact want to extend rather than simply replace the base class method of the same name. There is a simple way to call the base class method directly: just call BaseClassName.methodname(self, arguments). This is occasionally useful to clients as well. (Note that this only works if the base class is accessible as BaseClassName in the global scope.)
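Both points, virtual-style dispatch and extending a base class method, can be sketched with two small classes. Base and Derived are hypothetical names:

```python
class Base:
    def describe(self):
        return ["base"]

    def report(self):
        # Calls self.describe(), which may be overridden in a subclass.
        return ", ".join(self.describe())


class Derived(Base):
    def describe(self):
        # Extend rather than replace: call the base class method directly.
        parts = Base.describe(self)
        parts.append("derived")
        return parts


print(Base().report())
print(Derived().report())
```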

Python has two built-in functions that work with inheritance:

  • Use isinstance() to check an instance’s type: isinstance(obj, int) will be True only if obj.__class__ is int or some class derived from int.
  • Use issubclass() to check class inheritance: issubclass(bool, int) is True since bool is a subclass of int. However, issubclass(float, int) is False since float is not a subclass of int.
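
A short sketch of both functions, using two hypothetical classes (Animal and Dog) alongside the built-in types mentioned above:

```python
class Animal:
    pass


class Dog(Animal):
    pass


rex = Dog()

print(isinstance(rex, Dog))      # True
print(isinstance(rex, Animal))   # True: Dog is derived from Animal
print(isinstance(True, int))     # True: bool is a subclass of int
print(issubclass(Dog, Animal))   # True
print(issubclass(Animal, Dog))   # False: the relation is not symmetric
print(issubclass(float, int))    # False
```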

5.1. Multiple Inheritance

Python supports a form of multiple inheritance as well. A class definition with multiple base classes looks like this:

class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>

For most purposes, in the simplest cases, you can think of the search for attributes inherited from a parent class as depth-first, left-to-right, not searching twice in the same class where there is an overlap in the hierarchy. Thus, if an attribute is not found in DerivedClassName, it is searched for in Base1, then (recursively) in the base classes of Base1, and if it is not found there, it is searched for in Base2, and so on.

In fact, it is slightly more complex than that; the method resolution order changes dynamically to support cooperative calls to super(). This approach is known in some other multiple-inheritance languages as call-next-method and is more powerful than the super call found in single-inheritance languages.

Dynamic ordering is necessary because all cases of multiple inheritance exhibit one or more diamond relationships (where at least one of the parent classes can be accessed through multiple paths from the bottommost class). For example, all classes inherit from object, so any case of multiple inheritance provides more than one path to reach object. To keep the base classes from being accessed more than once, the dynamic algorithm linearizes the search order in a way that preserves the left-to-right ordering specified in each class, that calls each parent only once, and that is monotonic (meaning that a class can be subclassed without affecting the precedence order of its parents). Taken together, these properties make it possible to design reliable and extensible classes with multiple inheritance. For more detail, see https://www.python.org/download/releases/2.3/mro/.
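
As an illustration, here is a minimal diamond hierarchy with hypothetical class names. The __mro__ attribute shows the linearized search order, and the cooperative super() calls visit each class exactly once:

```python
class Base:
    def setup(self):
        return ["Base"]


class Left(Base):
    def setup(self):
        return ["Left"] + super().setup()


class Right(Base):
    def setup(self):
        return ["Right"] + super().setup()


class Bottom(Left, Right):
    def setup(self):
        return ["Bottom"] + super().setup()


# The linearized search order: left-to-right, each class only once
print([cls.__name__ for cls in Bottom.__mro__])
# ['Bottom', 'Left', 'Right', 'Base', 'object']

# super() in Left continues with Right (not Base) for a Bottom instance
print(Bottom().setup())
# ['Bottom', 'Left', 'Right', 'Base']
```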

6. Private Variables

“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.

Since there is a valid use-case for class-private members (namely to avoid name clashes of names with names defined by subclasses), there is limited support for such a mechanism, called name mangling. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, as long as it occurs within the definition of a class.

Name mangling is helpful for letting subclasses override methods without breaking intraclass method calls. For example:

class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update   # private copy of original update() method

class MappingSubclass(Mapping):

    def update(self, keys, values):
        # provides new signature for update()
        # but does not break __init__()
        for item in zip(keys, values):
            self.items_list.append(item)

The above example would work even if MappingSubclass were to introduce a __update identifier since it is replaced with _Mapping__update in the Mapping class and _MappingSubclass__update in the MappingSubclass class respectively.

Note that the mangling rules are designed mostly to avoid accidents; it still is possible to access or modify a variable that is considered private. This can even be useful in special circumstances, such as in the debugger.
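
A small sketch of what mangling does in practice, using a hypothetical Account class. The attribute is stored under its mangled name and remains reachable from outside the class:

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance   # stored as _Account__balance

    def balance(self):
        return self.__balance      # mangled the same way inside the class


acct = Account(100)

print(sorted(vars(acct)))          # ['_Account__balance']
print(acct.balance())              # 100
print(acct._Account__balance)      # 100: still accessible via the mangled name
print(hasattr(acct, "__balance"))  # False: strings passed to hasattr() are not mangled
```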

Notice that code passed to exec() or eval() does not consider the classname of the invoking class to be the current class; this is similar to the effect of the global statement, the effect of which is likewise restricted to code that is byte-compiled together. The same restriction applies to getattr(), setattr() and delattr(), as well as when referencing __dict__ directly.

7. Odds and Ends

Sometimes it is useful to have a data type similar to the Pascal “record” or C “struct”, bundling together a few named data items. An empty class definition will do nicely:

class Employee:
    pass

john = Employee()  # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000

A piece of Python code that expects a particular abstract data type can often be passed a class that emulates the methods of that data type instead. For instance, if you have a function that formats some data from a file object, you can define a class with methods read() and readline() that get the data from a string buffer instead, and pass it as an argument.
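
The idea can be sketched like this. StringReader and first_line_upper are hypothetical names invented for the example; in real code the standard library's io.StringIO already provides a string-backed file object.

```python
class StringReader:
    """Hypothetical stand-in for a file object, backed by a string."""

    def __init__(self, text):
        self._lines = text.splitlines(keepends=True)
        self._pos = 0

    def readline(self):
        # Return the next line, or '' at the end, as file objects do
        if self._pos >= len(self._lines):
            return ""
        line = self._lines[self._pos]
        self._pos += 1
        return line

    def read(self):
        # Return everything that has not been read yet
        rest = "".join(self._lines[self._pos:])
        self._pos = len(self._lines)
        return rest


def first_line_upper(source):
    # Works with any object that has a readline() method
    return source.readline().strip().upper()


reader = StringReader("alpha\nbeta\n")
print(first_line_upper(reader))  # ALPHA
print(repr(reader.read()))       # 'beta\n'
```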

Instance method objects have attributes, too: m.__self__ is the instance object with the method m(), and m.__func__ is the function object corresponding to the method.
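
For example, with a hypothetical Greeter class:

```python
class Greeter:
    def hello(self):
        return "hi"


g = Greeter()
m = g.hello                         # a bound method object

print(m.__self__ is g)              # True: the instance the method is bound to
print(m.__func__ is Greeter.hello)  # True: the underlying function object
print(m.__func__(g))                # hi: same result as calling m()
```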

8. Iterators

By now you have probably noticed that most container objects can be looped over using a for statement:

for element in [1, 2, 3]:
    print(element)
for element in (1, 2, 3):
    print(element)
for key in {'one':1, 'two':2}:
    print(key)
for char in "123":
    print(char)
for line in open("myfile.txt"):
    print(line, end='')

This style of access is clear, concise, and convenient. The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method __next__(), which accesses elements in the container one at a time. When there are no more elements, __next__() raises a StopIteration exception which tells the for loop to terminate. You can call the __next__() method using the next() built-in function; this example shows how it all works:

>>> s = 'abc'
>>> it = iter(s)
>>> it
<str_iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    next(it)
StopIteration

Having seen the mechanics behind the iterator protocol, it is easy to add iterator behavior to your classes. Define an __iter__() method which returns an object with a __next__() method. If the class defines __next__(), then __iter__() can just return self:

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
...     print(char)
...
m
a
p
s

9. Generators

Generators are a simple and powerful tool for creating iterators. They are written like regular functions but use the yield statement whenever they want to return data. Each time next() is called on it, the generator resumes where it left off (it remembers all the data values and which statement was last executed). An example shows that generators can be trivially easy to create:

def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]

>>> for char in reverse('golf'):
...     print(char)
...
f
l
o
g

Anything that can be done with generators can also be done with class-based iterators as described in the previous section. What makes generators so compact is that the __iter__() and __next__() methods are created automatically.

Another key feature is that the local variables and execution state are automatically saved between calls. This makes the function easier to write and much clearer than an approach using instance variables like self.index and self.data.

In addition to automatic method creation and saving program state, when generators terminate, they automatically raise StopIteration. In combination, these features make it easy to create iterators with no more effort than writing a regular function.
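
These three properties can be observed directly. The following sketch uses a hypothetical countdown() generator:

```python
def countdown(n):
    # The local variable n is preserved between next() calls
    while n > 0:
        yield n
        n -= 1


gen = countdown(2)

print(iter(gen) is gen)  # True: __iter__() was created automatically
print(next(gen))         # 2
print(next(gen))         # 1: execution resumed after the yield

try:
    next(gen)
except StopIteration:
    # Raised automatically when the function body finishes
    print("exhausted")
```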

10. Generator Expressions

Some simple generators can be coded succinctly as expressions using a syntax similar to list comprehensions but with parentheses instead of square brackets. These expressions are designed for situations where the generator is used right away by an enclosing function. Generator expressions are more compact but less versatile than full generator definitions and tend to be more memory friendly than equivalent list comprehensions.

Examples:

>>> sum(i*i for i in range(10))                 # sum of squares
285

>>> xvec = [10, 20, 30]
>>> yvec = [7, 5, 3]
>>> sum(x*y for x,y in zip(xvec, yvec))         # dot product
260

>>> from math import pi, sin
>>> sine_table = {x: sin(x*pi/180) for x in range(0, 91)}

>>> unique_words = set(word  for line in page  for word in line.split())

>>> valedictorian = max((student.gpa, student.name) for student in graduates)

>>> data = 'golf'
>>> list(data[i] for i in range(len(data)-1, -1, -1))
['f', 'l', 'o', 'g']

Python Control Flow

This tutorial discusses how the Python interpreter decides which statements to execute and in what order. Python uses the following statements and constructs to direct control flow.

  • 1. if Statements
  • 2. for Statements
  • 3. The range() Function
  • 4. break and continue Statements, and else Clauses on Loops
  • 5. pass Statements
  • 6. Defining Functions
  • 7. More on Defining Functions
  • 7.1. Default Argument Values
  • 7.2. Keyword Arguments
  • 7.3. Arbitrary Argument Lists
  • 7.4. Unpacking Argument Lists
  • 7.5. Lambda Expressions
  • 7.6. Documentation Strings
  • 7.7. Function Annotations

1. if Statements

Perhaps the most well-known statement type is the if statement. For example:

>>> x = int(input("Please enter an integer: "))
Please enter an integer: 42
>>> if x < 0:
...     x = 0
...     print('Negative changed to zero')
... elif x == 0:
...     print('Zero')
... elif x == 1:
...     print('Single')
... else:
...     print('More')
...
More

There can be zero or more elif parts, and the else part is optional. The keyword ‘elif’ is short for ‘else if’, and is useful to avoid excessive indentation. An if … elif … elif … sequence is a substitute for the switch or case statements found in other languages.

2. for Statements

The for statement in Python differs a bit from what you may be used to in C or Pascal. Rather than always iterating over an arithmetic progression of numbers (like in Pascal), or giving the user the ability to define both the iteration step and halting condition (as in C), Python’s for statement iterates over the items of any sequence (such as a list or a string), in the order that they appear in the sequence. For example (no pun intended):

>>> # Measure some strings:
... words = ['cat', 'window', 'defenestrate']
>>> for w in words:
...     print(w, len(w))
...
cat 3
window 6
defenestrate 12

If you need to modify the sequence you are iterating over while inside the loop (for example to duplicate selected items), it is recommended that you first make a copy. Iterating over a sequence does not implicitly make a copy. The slice notation makes this especially convenient:

>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']

With for w in words:, the example would attempt to create an infinite list, inserting defenestrate over and over again.

3. The range() Function

If you do need to iterate over a sequence of numbers, the built-in function range() comes in handy. It generates arithmetic progressions:

>>> for i in range(5):
...     print(i)
...
0
1
2
3
4

The given end point is never part of the generated sequence; range(10) generates 10 values, the legal indices for items of a sequence of length 10. It is possible to let the range start at another number, or to specify a different increment (even negative; sometimes this is called the ‘step’):

range(5, 10)
   5, 6, 7, 8, 9

range(0, 10, 3)
   0, 3, 6, 9

range(-10, -100, -30)
  -10, -40, -70

To iterate over the indices of a sequence, you can combine range() and len() as follows:

>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
...     print(i, a[i])
...
0 Mary
1 had
2 a
3 little
4 lamb

In most such cases, however, it is convenient to use the enumerate() function, see Looping Techniques.
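
For comparison, the same loop written with enumerate(), which yields (index, item) pairs:

```python
a = ['Mary', 'had', 'a', 'little', 'lamb']

# enumerate() supplies the index, so range(len(a)) and a[i] are not needed
for i, word in enumerate(a):
    print(i, word)

# An optional second argument sets the starting index
numbered = list(enumerate(a, 1))
print(numbered[0])  # (1, 'Mary')
```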

A strange thing happens if you just print a range:

>>> print(range(10))
range(0, 10)

In many ways the object returned by range() behaves as if it is a list, but in fact it isn’t. It is an object which returns the successive items of the desired sequence when you iterate over it, but it doesn’t really make the list, thus saving space.
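
A brief sketch of that list-like behavior:

```python
r = range(0, 10, 3)

print(len(r))               # 4
print(r[2])                 # 6: indexing works without building a list
print(9 in r)               # True
print(isinstance(r, list))  # False: it is a range object, not a list
print(list(r))              # [0, 3, 6, 9]: the list is only built on request
```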

We say such an object is iterable, that is, suitable as a target for functions and constructs that expect something from which they can obtain successive items until the supply is exhausted. We have seen that the for statement is such a construct. The function list() is another consumer of iterables; it creates lists from them:

>>> list(range(5))
[0, 1, 2, 3, 4]

Later we will see more functions that return iterables and take iterables as argument.

4. break and continue Statements, and else Clauses on Loops

The break statement, like in C, breaks out of the innermost enclosing for or while loop.

Loop statements may have an else clause; it is executed when the loop terminates through exhaustion of the list (with for) or when the condition becomes false (with while), but not when the loop is terminated by a break statement. This is exemplified by the following loop, which searches for prime numbers:

>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...             print(n, 'equals', x, '*', n//x)
...             break
...     else:
...         # loop fell through without finding a factor
...         print(n, 'is a prime number')
...
2 is a prime number
3 is a prime number
4 equals 2 * 2
5 is a prime number
6 equals 2 * 3
7 is a prime number
8 equals 2 * 4
9 equals 3 * 3

(Yes, this is the correct code. Look closely: the else clause belongs to the for loop, not the if statement.)

When used with a loop, the else clause has more in common with the else clause of a try statement than it does that of if statements: a try statement’s else clause runs when no exception occurs, and a loop’s else clause runs when no break occurs. For more on the try statement and exceptions, see Handling Exceptions.
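
The parallel can be seen side by side in the sketch below. Both helper functions are hypothetical and exist only to show the two else clauses:

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting break
        result = None
    return result


def parse_int(text):
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        # Runs only when no exception occurred in the try block
        return value


print(find_first_negative([3, -1, -5]))  # -1
print(find_first_negative([1, 2]))       # None
print(parse_int("42"))                   # 42
print(parse_int("forty-two"))            # None
```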

The continue statement, also borrowed from C, continues with the next iteration of the loop:

>>> for num in range(2, 10):
...     if num % 2 == 0:
...         print("Found an even number", num)
...         continue
...     print("Found a number", num)
Found an even number 2
Found a number 3
Found an even number 4
Found a number 5
Found an even number 6
Found a number 7
Found an even number 8
Found a number 9

5. pass Statements

The pass statement does nothing. It can be used when a statement is required syntactically but the program requires no action. For example:

>>> while True:
...     pass  # Busy-wait for keyboard interrupt (Ctrl+C)
...

This is commonly used for creating minimal classes:

>>> class MyEmptyClass:
...     pass
...

Another place pass can be used is as a place-holder for a function or conditional body when you are working on new code, allowing you to keep thinking at a more abstract level. The pass is silently ignored:

>>> def initlog(*args):
...     pass   # Remember to implement this!
...

6. Defining Functions

We can create a function that writes the Fibonacci series to an arbitrary boundary:

>>> def fib(n):    # write Fibonacci series up to n
...     """Print a Fibonacci series up to n."""
...     a, b = 0, 1
...     while a < n:
...         print(a, end=' ')
...         a, b = b, a+b
...     print()
...
>>> # Now call the function we just defined:
... fib(2000)
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597

The keyword def introduces a function definition. It must be followed by the function name and the parenthesized list of formal parameters. The statements that form the body of the function start at the next line, and must be indented.

The first statement of the function body can optionally be a string literal; this string literal is the function’s documentation string, or docstring. (More about docstrings can be found in the section Documentation Strings.) There are tools which use docstrings to automatically produce online or printed documentation, or to let the user interactively browse through code; it’s good practice to include docstrings in code that you write, so make a habit of it.

The execution of a function introduces a new symbol table used for the local variables of the function. More precisely, all variable assignments in a function store the value in the local symbol table; whereas variable references first look in the local symbol table, then in the local symbol tables of enclosing functions, then in the global symbol table, and finally in the table of built-in names. Thus, global variables and variables of enclosing functions cannot be directly assigned a value within a function (unless, for global variables, named in a global statement, or, for variables of enclosing functions, named in a nonlocal statement), although they may be referenced.
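
The lookup and assignment rules can be illustrated with a short sketch; all names here are made up for the example:

```python
counter = 0  # a global variable


def read_only():
    return counter + 1      # referencing a global needs no declaration


def shadow():
    counter = 99            # assignment creates a new local variable
    return counter


def bump_global():
    global counter          # assignments now target the global name
    counter += 1


def make_counter():
    count = 0

    def increment():
        nonlocal count      # assignments target the enclosing function's variable
        count += 1
        return count

    return increment


shadow()
print(counter)   # 0: the global was not touched by shadow()
bump_global()
print(counter)   # 1
inc = make_counter()
inc()
print(inc())     # 2
```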

The actual parameters (arguments) to a function call are introduced in the local symbol table of the called function when it is called; thus, arguments are passed using call by value (where the value is always an object reference, not the value of the object). When a function calls another function, a new local symbol table is created for that call.
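
Because the value passed is an object reference, a function can mutate the caller's object but cannot rebind the caller's name. A minimal sketch with two hypothetical functions:

```python
def append_item(items):
    items.append(4)   # mutates the object the caller also refers to


def rebind(items):
    items = [0]       # only rebinds the local name; the caller is unaffected


data = [1, 2, 3]

append_item(data)
print(data)  # [1, 2, 3, 4]

rebind(data)
print(data)  # [1, 2, 3, 4]
```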

A function definition introduces the function name in the current symbol table. The value of the function name has a type that is recognized by the interpreter as a user-defined function. This value can be assigned to another name which can then also be used as a function. This serves as a general renaming mechanism:

>>> fib
<function fib at 10042ed0>
>>> f = fib
>>> f(100)
0 1 1 2 3 5 8 13 21 34 55 89

Coming from other languages, you might object that fib is not a function but a procedure since it doesn’t return a value. In fact, even functions without a return statement do return a value, albeit a rather boring one. This value is called None (it’s a built-in name). Writing the value None is normally suppressed by the interpreter if it would be the only value written. You can see it if you really want to using print():

>>> fib(0)
>>> print(fib(0))
None

It is simple to write a function that returns a list of the numbers of the Fibonacci series, instead of printing it:

>>> def fib2(n):  # return Fibonacci series up to n
...     """Return a list containing the Fibonacci series up to n."""
...     result = []
...     a, b = 0, 1
...     while a < n:
...         result.append(a)    # see below
...         a, b = b, a+b
...     return result
...
>>> f100 = fib2(100)    # call it
>>> f100                # write the result
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

This example, as usual, demonstrates some new Python features:

  • The return statement returns with a value from a function. return without an expression argument returns None. Falling off the end of a function also returns None.
  • The statement result.append(a) calls a method of the list object result. A method is a function that ‘belongs’ to an object and is named obj.methodname, where obj is some object (this may be an expression), and methodname is the name of a method that is defined by the object’s type. Different types define different methods. Methods of different types may have the same name without causing ambiguity. (It is possible to define your own object types and methods, using classes, see Classes) The method append() shown in the example is defined for list objects; it adds a new element at the end of the list. In this example it is equivalent to result = result + [a], but more efficient.

7. More on Defining Functions

It is also possible to define functions with a variable number of arguments. There are three forms, which can be combined.

7.1. Default Argument Values

The most useful form is to specify a default value for one or more arguments. This creates a function that can be called with fewer arguments than it is defined to allow. For example:

def ask_ok(prompt, retries=4, reminder='Please try again!'):
    while True:
        ok = input(prompt)
        if ok in ('y', 'ye', 'yes'):
            return True
        if ok in ('n', 'no', 'nop', 'nope'):
            return False
        retries = retries - 1
        if retries < 0:
            raise ValueError('invalid user response')
        print(reminder)

This function can be called in several ways:

  • giving only the mandatory argument: ask_ok('Do you really want to quit?')
  • giving one of the optional arguments: ask_ok('OK to overwrite the file?', 2)
  • or even giving all arguments: ask_ok('OK to overwrite the file?', 2, 'Come on, only yes or no!')

This example also introduces the in keyword. This tests whether or not a sequence contains a certain value.

The default values are evaluated at the point of function definition in the defining scope, so that

i = 5

def f(arg=i):
    print(arg)

i = 6
f()

will print 5.

Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. For example, the following function accumulates the arguments passed to it on subsequent calls:

def f(a, L=[]):
    L.append(a)
    return L

print(f(1))
print(f(2))
print(f(3))

This will print

[1]
[1, 2]
[1, 2, 3]

If you don’t want the default to be shared between subsequent calls, you can write the function like this instead:

def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L

7.2. Keyword Arguments

Functions can also be called using keyword arguments of the form kwarg=value. For instance, the following function:

def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print("-- This parrot wouldn't", action, end=' ')
    print("if you put", voltage, "volts through it.")
    print("-- Lovely plumage, the", type)
    print("-- It's", state, "!")

accepts one required argument (voltage) and three optional arguments (state, action, and type). This function can be called in any of the following ways:

parrot(1000)                                          # 1 positional argument
parrot(voltage=1000)                                  # 1 keyword argument
parrot(voltage=1000000, action='VOOOOOM')             # 2 keyword arguments
parrot(action='VOOOOOM', voltage=1000000)             # 2 keyword arguments
parrot('a million', 'bereft of life', 'jump')         # 3 positional arguments
parrot('a thousand', state='pushing up the daisies')  # 1 positional, 1 keyword

but all the following calls would be invalid:

parrot()                     # required argument missing
parrot(voltage=5.0, 'dead')  # non-keyword argument after a keyword argument
parrot(110, voltage=220)     # duplicate value for the same argument
parrot(actor='John Cleese')  # unknown keyword argument

In a function call, keyword arguments must follow positional arguments. All the keyword arguments passed must match one of the arguments accepted by the function (e.g. actor is not a valid argument for the parrot function), and their order is not important. This also includes non-optional arguments (e.g. parrot(voltage=1000) is valid too). No argument may receive a value more than once. Here’s an example that fails due to this restriction:

>>> def function(a):
...     pass
...
>>> function(0, a=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: function() got multiple values for argument 'a'

When a final formal parameter of the form **name is present, it receives a dictionary (see Mapping Types — dict) containing all keyword arguments except for those corresponding to a formal parameter. This may be combined with a formal parameter of the form *name (described in the next subsection) which receives a tuple containing the positional arguments beyond the formal parameter list. (*name must occur before **name.) For example, if we define a function like this:

def cheeseshop(kind, *arguments, **keywords):
    print("-- Do you have any", kind, "?")
    print("-- I'm sorry, we're all out of", kind)
    for arg in arguments:
        print(arg)
    print("-" * 40)
    for kw in keywords:
        print(kw, ":", keywords[kw])

It could be called like this:

cheeseshop("Limburger", "It's very runny, sir.",
           "It's really very, VERY runny, sir.",
           shopkeeper="Michael Palin",
           client="John Cleese",
           sketch="Cheese Shop Sketch")

and of course it would print:

-- Do you have any Limburger ?
-- I'm sorry, we're all out of Limburger
It's very runny, sir.
It's really very, VERY runny, sir.
----------------------------------------
shopkeeper : Michael Palin
client : John Cleese
sketch : Cheese Shop Sketch

Note that the order in which the keyword arguments are printed is guaranteed to match the order in which they were provided in the function call.

7.3. Arbitrary Argument Lists

Finally, the least frequently used option is to specify that a function can be called with an arbitrary number of arguments. These arguments will be wrapped up in a tuple (see Tuples and Sequences). Before the variable number of arguments, zero or more normal arguments may occur.

def write_multiple_items(file, separator, *args):
    file.write(separator.join(args))

Normally, these variadic arguments will be last in the list of formal parameters, because they scoop up all remaining input arguments that are passed to the function. Any formal parameters which occur after the *args parameter are ‘keyword-only’ arguments, meaning that they can only be used as keywords rather than positional arguments.

>>> def concat(*args, sep="/"):
...     return sep.join(args)
...
>>> concat("earth", "mars", "venus")
'earth/mars/venus'
>>> concat("earth", "mars", "venus", sep=".")
'earth.mars.venus'

7.4. Unpacking Argument Lists

The reverse situation occurs when the arguments are already in a list or tuple but need to be unpacked for a function call requiring separate positional arguments. For instance, the built-in range() function expects separate start and stop arguments. If they are not available separately, write the function call with the * operator to unpack the arguments out of a list or tuple:

>>> list(range(3, 6))            # normal call with separate arguments
[3, 4, 5]
>>> args = [3, 6]
>>> list(range(*args))            # call with arguments unpacked from a list
[3, 4, 5]

In the same fashion, dictionaries can deliver keyword arguments with the ** operator:

>>> def parrot(voltage, state='a stiff', action='voom'):
...     print("-- This parrot wouldn't", action, end=' ')
...     print("if you put", voltage, "volts through it.", end=' ')
...     print("E's", state, "!")
...
>>> d = {"voltage": "four million", "state": "bleedin' demised", "action": "VOOM"}
>>> parrot(**d)
-- This parrot wouldn't VOOM if you put four million volts through it. E's bleedin' demised !

7.5. Lambda Expressions

Small anonymous functions can be created with the lambda keyword. This function returns the sum of its two arguments: lambda a, b: a+b. Lambda functions can be used wherever function objects are required. They are syntactically restricted to a single expression. Semantically, they are just syntactic sugar for a normal function definition. Like nested function definitions, lambda functions can reference variables from the containing scope:

>>> def make_incrementor(n):
...     return lambda x: x + n
...
>>> f = make_incrementor(42)
>>> f(0)
42
>>> f(1)
43

The above example uses a lambda expression to return a function. Another use is to pass a small function as an argument:

>>> pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]
>>> pairs.sort(key=lambda pair: pair[1])
>>> pairs
[(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')]

7.6. Documentation Strings

Here are some conventions about the content and formatting of documentation strings.

The first line should always be a short, concise summary of the object’s purpose. For brevity, it should not explicitly state the object’s name or type, since these are available by other means (except if the name happens to be a verb describing a function’s operation). This line should begin with a capital letter and end with a period.

If there are more lines in the documentation string, the second line should be blank, visually separating the summary from the rest of the description. The following lines should be one or more paragraphs describing the object’s calling conventions, its side effects, etc.

The Python parser does not strip indentation from multi-line string literals in Python, so tools that process documentation have to strip indentation if desired. This is done using the following convention. The first non-blank line after the first line of the string determines the amount of indentation for the entire documentation string. (We can’t use the first line since it is generally adjacent to the string’s opening quotes so its indentation is not apparent in the string literal.) Whitespace “equivalent” to this indentation is then stripped from the start of all lines of the string. Lines that are indented less should not occur, but if they occur all their leading whitespace should be stripped. Equivalence of whitespace should be tested after expansion of tabs (to 8 spaces, normally).
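
The standard library's inspect.cleandoc() applies this kind of indentation clean-up. The sketch below feeds it a plain string literal rather than a real __doc__ attribute, so the result does not depend on how a particular interpreter version stores docstrings:

```python
import inspect

# The text of a docstring as it would appear in indented source code
raw = "Do nothing, but document it.\n\n    No, really, it doesn't do anything.\n    "

cleaned = inspect.cleandoc(raw)

print(repr(cleaned))
# "Do nothing, but document it.\n\nNo, really, it doesn't do anything."
```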

Here is an example of a multi-line docstring:

>>> def my_function():
...     """Do nothing, but document it.
...
...     No, really, it doesn't do anything.
...     """
...     pass
...
>>> print(my_function.__doc__)
Do nothing, but document it.

    No, really, it doesn't do anything.

7.7. Function Annotations

Function annotations are completely optional metadata information about the types used by user-defined functions (see PEP 3107 and PEP 484 for more information).

Annotations are stored in the __annotations__ attribute of the function as a dictionary and have no effect on any other part of the function. Parameter annotations are defined by a colon after the parameter name, followed by an expression evaluating to the value of the annotation. Return annotations are defined by a literal ->, followed by an expression, between the parameter list and the colon denoting the end of the def statement. The following example has a positional argument, a keyword argument, and the return value annotated:

>>> def f(ham: str, eggs: str = 'eggs') -> str:
...     print("Annotations:", f.__annotations__)
...     print("Arguments:", ham, eggs)
...     return ham + ' and ' + eggs
...
>>> f('spam')
Annotations: {'ham': <class 'str'>, 'eggs': <class 'str'>, 'return': <class 'str'>}
Arguments: spam eggs
'spam and eggs'

Python Naming Convention Rules

1. General

  • Avoid using names that are too general or too wordy. Strike a good balance between the two.
  • Bad: data_structure, my_list, info_map, dictionary_for_the_purpose_of_storing_data_representing_word_definitions
  • Good: user_profile, menu_options, word_definitions
  • Don’t name things “O”, “l”, or “I”; they are too easily confused with 0 and 1
  • When using CamelCase names, capitalize all letters of an abbreviation (e.g. HTTPServer)

2. Packages

  • Package names should be all lower case
  • Underscores are discouraged in package names (PEP 8); use one only when multiple words are really needed
  • It is usually preferable to stick to 1 word names

3. Modules

  • Module names should be all lower case
  • When multiple words are needed, an underscore should separate them
  • It is usually preferable to stick to 1 word names

4. Classes

  • Class names should follow the UpperCaseCamelCase convention
  • Python’s built-in classes, however, are typically lowercase words
  • Exception classes should end in “Error” (when the exception actually is an error)

5. Global (module-level) Variables

  • Global variables should be all lowercase
  • Words in a global variable name should be separated by an underscore

6. Instance Variables

  • Instance variable names should be all lower case
  • Words in an instance variable name should be separated by an underscore
  • Non-public instance variables should begin with a single underscore
  • If an instance variable name needs to be mangled, two underscores may begin its name

7. Methods

  • Method names should be all lower case
  • Words in a method name should be separated by an underscore
  • Non-public methods should begin with a single underscore
  • If a method name needs to be mangled, two underscores may begin its name

8. Method Arguments

  • Instance methods should have their first argument named ‘self’.
  • Class methods should have their first argument named ‘cls’

9. Functions

  • Function names should be all lower case
  • Words in a function name should be separated by an underscore

10. Constants

  • Constant names must be fully capitalized
  • Words in a constant name should be separated by an underscore
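
The rules above can be seen together in one short sketch. Every name in it (MAX_RETRIES, HTTPClient, build_url, and so on) is invented purely for illustration.

```python
# Constant: fully capitalized, words separated by underscores.
MAX_RETRIES = 3

# Global (module-level) variable: lowercase with underscores.
default_timeout = 30


class ConnectionFailedError(Exception):
    """Exception class: CamelCase name ending in "Error"."""


class HTTPClient:
    """Class: CamelCase, with the abbreviation HTTP fully capitalized."""

    def __init__(self, base_url):
        self.base_url = base_url   # public instance variable
        self._retry_count = 0      # non-public: single leading underscore
        self.__token = "secret"    # mangled to _HTTPClient__token

    def build_url(self, path):
        """Instance method: lowercase name, first argument `self`."""
        return self.base_url.rstrip("/") + "/" + path.lstrip("/")

    @classmethod
    def from_host(cls, host):
        """Class method: first argument named `cls`."""
        return cls("https://" + host)


def format_user_profile(user_profile):
    """Function: lowercase with underscores."""
    return ", ".join(f"{key}={value}" for key, value in user_profile.items())


client = HTTPClient.from_host("example.com")
print(client.build_url("/status"))
print(format_user_profile({"name": "Ada", "role": "admin"}))
```

The double-underscore attribute is stored under the mangled name _HTTPClient__token, which is why it is not reachable as client.__token from outside the class.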

Learn Numpy

Numpy is a general-purpose array-processing package. It provides a high-performance multidimensional array object, and tools for working with these arrays. It is the fundamental package for scientific computing with Python.
Besides its obvious scientific uses, Numpy can also be used as an efficient multi-dimensional container of generic data.

Numpy Array

An array in NumPy is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, the number of dimensions of the array is called the rank of the array. A tuple of integers giving the size of the array along each dimension is known as the shape of the array. The array class in NumPy is called ndarray. Elements in NumPy arrays are accessed by using square brackets, and arrays can be initialized from nested Python lists.

Creating a Numpy Array

Arrays in NumPy can be created in multiple ways and with any rank (number of dimensions). They can be built from Python sequences such as lists and tuples. The type of the resulting array is deduced from the type of the elements in the sequences.

Below are some of the basic NumPy functions and attributes used to create and inspect arrays.

1. np.array
2. np.shape
3. np.zeros
4. np.empty
5. np.eye

1. np.array(list): converts a Python list to a NumPy array.

numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

Create an array.

Parameters:

  • object : array_like. An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
  • dtype : data-type, optional. The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to 'upcast' the array. For downcasting, use the .astype(t) method.
  • copy : bool, optional. If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.).
  • order : {'K', 'A', 'C', 'F'}, optional. Specify the memory layout of the array. If object is not an array, the newly created array will be in C order (row major) unless 'F' is specified, in which case it will be in Fortran order (column major). If object is an array the following holds:

    order | no copy   | copy=True
    'K'   | unchanged | F & C order preserved, otherwise most similar order
    'A'   | unchanged | F order if input is F and not C, otherwise C order
    'C'   | C order   | C order
    'F'   | F order   | F order

    When copy=False and a copy is made for other reasons, the result is the same as if copy=True, with some exceptions for A, see the Notes section. The default order is 'K'.
  • subok : bool, optional. If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).
  • ndmin : int, optional. Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.

Returns:

  • out : ndarray. An array object satisfying the specified requirements.

The example below creates a NumPy array from a single list:

import numpy as np
ls=[1,2,34,5]
print("Type of ls=",type(ls))
np_arr=np.array(ls)
print("Printing Numpy Array:",np_arr)
print("Type of numpy Array",type(np_arr))
print("Dimension of numpy array",np_arr.ndim)

Output

Type of ls= <class 'list'>
Printing Numpy Array: [ 1  2 34  5]
Type of numpy Array <class 'numpy.ndarray'>
Dimension of numpy array 1

The example below creates a NumPy array from multiple lists:

import numpy as np
ls=[1,2,34,5]
ls1=[6,7,8,9]
ls2=[ls,ls1]
print("Type of ls=",type(ls2))
np_arr=np.array(ls2)
print("Printing Numpy Array:",np_arr)
print("Type of numpy Array",type(np_arr))
print("Dimension of numpy array",np_arr.ndim)

Output

Type of ls= <class 'list'>
Printing Numpy Array: [[ 1  2 34  5]
 [ 6  7  8  9]]
Type of numpy Array <class 'numpy.ndarray'>
Dimension of numpy array 2
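
The dtype and ndmin parameters described above can be tried out with a short sketch. The variable names are chosen only for illustration.

```python
import numpy as np

# dtype forces the element type instead of letting NumPy infer it.
float_arr = np.array([1, 2, 34, 5], dtype=np.float64)
print(float_arr, float_arr.dtype)

# ndmin prepends axes of length 1 until the array has at least 2 dimensions.
row = np.array([1, 2, 34, 5], ndmin=2)
print(row.shape)

# Mixing ints and floats upcasts the whole array to a float type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```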

2. ndarray.shape

Tuple of array dimensions.

The shape property is usually used to get the current shape of an array, but may also be used to reshape the array in-place by assigning a tuple of array dimensions to it. As with numpy.reshape, one of the new shape dimensions can be -1, in which case its value is inferred from the size of the array and the remaining dimensions. Reshaping an array in-place will fail if a copy is required.

See also:

  • numpy.reshape: similar function
  • ndarray.reshape: similar method

import numpy as np
ls=[1,2,34,5]
ls1=[6,7,8,9]
ls2=[ls,ls1]
np_arr=np.array(ls2)
print("Shape of Numpy Array",np_arr.shape)

Output

Shape of Numpy Array (2, 4)
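
As a rough sketch of the in-place reshaping described above, the following assigns a new tuple to shape and uses -1 for the inferred dimension. Recent NumPy documentation discourages assigning to shape and recommends reshape instead, so both are shown.

```python
import numpy as np

np_arr = np.array([[1, 2, 34, 5], [6, 7, 8, 9]])
print(np_arr.shape)

# Assigning to .shape reshapes in place; -1 lets NumPy infer that
# dimension from the array size (8 elements / 4 rows = 2 columns).
np_arr.shape = (4, -1)
print(np_arr.shape)

# reshape returns a new array object and leaves np_arr's shape unchanged.
flat = np_arr.reshape(-1)
print(flat.shape, np_arr.shape)
```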

3. numpy.zeros(shape, dtype=float, order='C')

Return a new array of given shape and type, filled with zeros.

Parameters:

  • shape : int or tuple of ints. Shape of the new array, e.g., (2, 3) or 2.
  • dtype : data-type, optional. The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.
  • order : {'C', 'F'}, optional, default: 'C'. Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.

Returns:

  • out : ndarray. Array of zeros with the given shape, dtype, and order.

import numpy as np
np_arr=np.zeros(5)
print(np_arr)
print("Type of numpy array:",np_arr.dtype)
print("Shape of Numpy Array",np_arr.shape)

Output

[0. 0. 0. 0. 0.]
Type of numpy array: float64
Shape of Numpy Array (5,)

4. numpy.empty(shape, dtype=float, order='C')

Return a new array of given shape and type, without initializing entries.

Parameters:

  • shape : int or tuple of int. Shape of the empty array, e.g., (2, 3) or 2.
  • dtype : data-type, optional. Desired output data-type for the array, e.g., numpy.int8. Default is numpy.float64.
  • order : {'C', 'F'}, optional, default: 'C'. Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.

Returns:

  • out : ndarray. Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.
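
Because numpy.empty leaves the entries uninitialized, a typical pattern is to allocate the array and then fill every element before reading it. A minimal sketch, with the name buf chosen for illustration:

```python
import numpy as np

# The contents of an np.empty array are arbitrary, so only its
# shape and dtype are meaningful until values are assigned.
buf = np.empty((2, 3), dtype=np.float64)
print(buf.shape, buf.dtype)

# Fill every element before reading: row i is set to the value i.
for i in range(buf.shape[0]):
    buf[i, :] = i
print(buf)
```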

5. numpy.eye(N, M=None, k=0, dtype=<class 'float'>, order='C')

Return a 2-D array with ones on the diagonal and zeros elsewhere.

Parameters:

  • N : int. Number of rows in the output.
  • M : int, optional. Number of columns in the output. If None, defaults to N.
  • k : int, optional. Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.
  • dtype : data-type, optional. Data-type of the returned array.
  • order : {'C', 'F'}, optional. Whether the output should be stored in row-major (C-style) or column-major (Fortran-style) order in memory. New in version 1.14.0.

Returns:

  • I : ndarray of shape (N,M). An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.

import numpy as np
np_arr=np.eye(5)
print(np_arr)
print("Type of numpy array:",np_arr.dtype)
print("Shape of Numpy Array:",np_arr.shape)

Output
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Type of numpy array: float64
Shape of Numpy Array: (5, 5)
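
The M and k parameters are easier to picture with a small sketch. The variable names upper and lower are only illustrative.

```python
import numpy as np

# 3 rows, 4 columns, ones on the first upper diagonal (k=1).
upper = np.eye(3, 4, k=1, dtype=int)
print(upper)

# k=-1 selects the first lower diagonal of a square 3x3 array.
lower = np.eye(3, k=-1, dtype=int)
print(lower)
```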

Numpy Mathematical Functions

Adding two NumPy arrays

Adding two NumPy arrays works like adding two matrices: the elements at corresponding positions are added.

import numpy as np
a=np.array([[2,5,6,4],[4,3,3,4]])
print(a)
print("---------------------------------")
print(a+a)

Output

[[2 5 6 4]
 [4 3 3 4]]
---------------------------------
[[ 4 10 12  8]
 [ 8  6  6  8]]

Subtracting two NumPy arrays

import numpy as np
a=np.array([[2,5,6,4],[4,3,3,4]])
b=np.array([[2,4,5,6],[4,5,6,7]])
print(a)
print("---------------------------------")
print(b)
print("---------------------------------")
print(a-b)

Output

[[2 5 6 4]
 [4 3 3 4]]
---------------------------------
[[2 4 5 6]
 [4 5 6 7]]
---------------------------------
[[ 0  1  1 -2]
 [ 0 -2 -3 -3]]

Multiplying two NumPy arrays

import numpy as np
a=np.array([[2,5,6,4],[4,3,3,4]])
b=np.array([[2,4,5,6],[4,5,6,7]])
print(a)
print("---------------------------------")
print(b)
print("---------------------------------")
print(a*b)

Output

[[2 5 6 4]
 [4 3 3 4]]
---------------------------------
[[2 4 5 6]
 [4 5 6 7]]
---------------------------------
[[ 4 20 30 24]
 [16 15 18 28]]
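
Note that the * operator multiplies element by element, which is not the same as a matrix product. The sketch below contrasts the two using the same a and b. Transposing b is only a choice made here so that the shapes are compatible for the @ operator.

```python
import numpy as np

a = np.array([[2, 5, 6, 4], [4, 3, 3, 4]])
b = np.array([[2, 4, 5, 6], [4, 5, 6, 7]])

# Element-wise product: shapes must match (or broadcast); result is 2x4.
elementwise = a * b

# Matrix product: (2x4) @ (4x2) gives a 2x2 result, so b is transposed first.
matrix_product = a @ b.T

print(elementwise)
print(matrix_product)
```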

Dividing two NumPy arrays

import numpy as np
a=np.array([[2,5,6,4],[4,3,3,4]])
b=np.array([[2,4,5,6],[4,5,6,7]])
print(a)
print("---------------------------------")
print(b)
print("---------------------------------")
print(a/b)

Output

[[2 5 6 4]
 [4 3 3 4]]
---------------------------------
[[2 4 5 6]
 [4 5 6 7]]
---------------------------------
[[1.         1.25       1.2        0.66666667]
 [1.         0.6        0.5        0.57142857]]

Raising a NumPy array to a power

import numpy as np
a=np.array([[2,5,6,4],[4,3,3,4]])
print(a)
print("---------------------------------")
print(a**2)
print("---------------------------------")
print(a**3)

Output

[[2 5 6 4]
 [4 3 3 4]]
---------------------------------
[[ 4 25 36 16]
 [16  9  9 16]]
---------------------------------
[[  8 125 216  64]
 [ 64  27  27  64]]

numpy.arange

numpy.arange([start, ]stop, [step, ]dtype=None)

Return evenly spaced values within a given interval.

Values are generated within the half-open interval [start, stop) (in other words, the interval including start but excluding stop). For integer arguments, the function is equivalent to the Python built-in range function, but returns an ndarray rather than a list.

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use numpy.linspace for these cases.

Parameters:

  • start : number, optional. Start of interval. The interval includes this value. The default start value is 0.
  • stop : number. End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.
  • step : number, optional. Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified as a position argument, start must also be given.
  • dtype : dtype. The type of the output array. If dtype is not given, infer the data type from the other input arguments.

Returns:

  • arange : ndarray. Array of evenly spaced values. For floating point arguments, the length of the result is ceil((stop - start)/step). Because of floating point overflow, this rule may result in the last element of out being greater than stop.

import numpy as np
a=np.arange(0,11)
print(a)
a=np.arange(0,11,2)
print(a)

Output

[ 0  1  2  3  4  5  6  7  8  9 10]
[ 0  2  4  6  8 10]
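
The advice about non-integer steps can be illustrated with a short sketch comparing numpy.arange and numpy.linspace. The interval 1 to 1.3 is just an example choice; the exact length returned by arange depends on floating point rounding.

```python
import numpy as np

# With a float step, the number of elements depends on rounding,
# so the stop value may or may not appear in the result.
stepped = np.arange(1, 1.3, 0.1)
print(len(stepped), stepped)

# linspace takes the number of points instead of a step and
# includes the stop value by default.
spaced = np.linspace(1, 1.3, 4)
print(len(spaced), spaced)
```

With linspace the length of the result is fixed by the caller, which is why it is the safer choice for fractional spacing.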

Unix / Linux – Shell Basic Operators

Each shell supports a variety of operators. This chapter covers the Bourne shell (the default shell) in detail. The operators are classified into the types below according to the operation they perform.

We will now discuss the following operators −

  • Arithmetic Operators
  • Relational Operators
  • Boolean Operators
  • String Operators
  • File Test Operators

The Bourne shell did not originally have any built-in mechanism for simple arithmetic; instead it relies on external programs, either awk or expr.

The following example shows how to add two numbers − 

#!/bin/sh

val=`expr 2 + 2`
echo "Total value : $val"

The above script will generate the following result −

Total value : 4

The following points need to be considered while adding −

  • There must be spaces between operators and expressions. For example, 2+2 is not correct; it should be written as 2 + 2.
  • The complete expression should be enclosed between ` `, called backticks.

Arithmetic Operators

The following arithmetic operators are supported by Bourne Shell.

Assume variable a holds 10 and variable b holds 20 then −

Operator | Description | Example
+ (Addition) | Adds values on either side of the operator | `expr $a + $b` will give 30
- (Subtraction) | Subtracts right hand operand from left hand operand | `expr $a - $b` will give -10
* (Multiplication) | Multiplies values on either side of the operator | `expr $a \* $b` will give 200
/ (Division) | Divides left hand operand by right hand operand | `expr $b / $a` will give 2
% (Modulus) | Divides left hand operand by right hand operand and returns remainder | `expr $b % $a` will give 0
= (Assignment) | Assigns right operand to left operand | a=$b would assign value of b into a
== (Equality) | Compares two numbers, if both are same then returns true. | [ $a == $b ] would return false.
!= (Not Equality) | Compares two numbers, if both are different then returns true. | [ $a != $b ] would return true.

It is very important to understand that all the conditional expressions should be inside square brackets with spaces around them; for example, [ $a == $b ] is correct whereas [$a==$b] is incorrect.

All the arithmetical calculations are done using long integers.

Relational Operators

Bourne Shell supports the following relational operators that are specific to numeric values. These operators do not work for string values unless their value is numeric.

For example, following operators will work to check a relation between 10 and 20 as well as in between “10” and “20” but not in between “ten” and “twenty”.

Assume variable a holds 10 and variable b holds 20 then −

Operator | Description | Example
-eq | Checks if the value of two operands are equal or not; if yes, then the condition becomes true. | [ $a -eq $b ] is not true.
-ne | Checks if the value of two operands are equal or not; if values are not equal, then the condition becomes true. | [ $a -ne $b ] is true.
-gt | Checks if the value of left operand is greater than the value of right operand; if yes, then the condition becomes true. | [ $a -gt $b ] is not true.
-lt | Checks if the value of left operand is less than the value of right operand; if yes, then the condition becomes true. | [ $a -lt $b ] is true.
-ge | Checks if the value of left operand is greater than or equal to the value of right operand; if yes, then the condition becomes true. | [ $a -ge $b ] is not true.
-le | Checks if the value of left operand is less than or equal to the value of right operand; if yes, then the condition becomes true. | [ $a -le $b ] is true.

It is very important to understand that all the conditional expressions should be placed inside square brackets with spaces around them. For example, [ $a -le $b ] is correct whereas [$a -le $b] is incorrect.

Boolean Operators

The following Boolean operators are supported by the Bourne Shell.

Assume variable a holds 10 and variable b holds 20 then −

Operator | Description | Example
! | This is logical negation. This inverts a true condition into false and vice versa. | [ ! false ] is true.
-o | This is logical OR. If one of the operands is true, then the condition becomes true. | [ $a -lt 20 -o $b -gt 100 ] is true.
-a | This is logical AND. If both the operands are true, then the condition becomes true otherwise false. | [ $a -lt 20 -a $b -gt 100 ] is false.

String Operators

The following string operators are supported by Bourne Shell.

Assume variable a holds “abc” and variable b holds “efg” then −

Operator | Description | Example
= | Checks if the value of two operands are equal or not; if yes, then the condition becomes true. | [ $a = $b ] is not true.
!= | Checks if the value of two operands are equal or not; if values are not equal then the condition becomes true. | [ $a != $b ] is true.
-z | Checks if the given string operand size is zero; if it is zero length, then it returns true. | [ -z $a ] is not true.
-n | Checks if the given string operand size is non-zero; if it is nonzero length, then it returns true. | [ -n $a ] is not false.
str | Checks if str is not the empty string; if it is empty, then it returns false. | [ $a ] is not false.

File Test Operators

We have a few operators that can be used to test various properties associated with a Unix file.

Assume a variable file holds an existing file name “test”, the size of which is 100 bytes and which has read, write, and execute permissions on −

Operator | Description | Example
-b file | Checks if file is a block special file; if yes, then the condition becomes true. | [ -b $file ] is false.
-c file | Checks if file is a character special file; if yes, then the condition becomes true. | [ -c $file ] is false.
-d file | Checks if file is a directory; if yes, then the condition becomes true. | [ -d $file ] is not true.
-f file | Checks if file is an ordinary file as opposed to a directory or special file; if yes, then the condition becomes true. | [ -f $file ] is true.
-g file | Checks if file has its set group ID (SGID) bit set; if yes, then the condition becomes true. | [ -g $file ] is false.
-k file | Checks if file has its sticky bit set; if yes, then the condition becomes true. | [ -k $file ] is false.
-p file | Checks if file is a named pipe; if yes, then the condition becomes true. | [ -p $file ] is false.
-t file | Checks if file descriptor is open and associated with a terminal; if yes, then the condition becomes true. | [ -t $file ] is false.
-u file | Checks if file has its Set User ID (SUID) bit set; if yes, then the condition becomes true. | [ -u $file ] is false.
-r file | Checks if file is readable; if yes, then the condition becomes true. | [ -r $file ] is true.
-w file | Checks if file is writable; if yes, then the condition becomes true. | [ -w $file ] is true.
-x file | Checks if file is executable; if yes, then the condition becomes true. | [ -x $file ] is true.
-s file | Checks if file has size greater than 0; if yes, then condition becomes true. | [ -s $file ] is true.
-e file | Checks if file exists; is true even if file is a directory but exists. | [ -e $file ] is true.