Python
By: Yijiang Zhao & Matthew Qu
This guide assumes no prior programming knowledge and gives a brief introduction to Python, one of the most widely used programming languages. Python is incredibly useful for doing data analysis, especially for large data sets.
This bootcamp will make sure that you are comfortable with basic Python and equip you with the tools to Google and Stack Overflow your way through more projects!
Variables (strings, ints, floats, booleans) & Operations
Variables & Types
A variable is a way to represent data in Python. We declare a variable by giving it a name and a value. Values of the variable can change and can have different types (e.g. integers or strings).
Here is an example of declaring a variable:
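A minimal sketch of such a declaration, using the name and value described in the text:

```python
# Declare a variable named y and assign it the value 3.45
y = 3.45
```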
Here we have a variable y and have assigned it the value 3.45. Variable names must begin with a letter or an underscore.
To print the value of a variable, we pass the variable to a function called print.
tip
Oftentimes, we might want to print multiple items. To do so, you can pass multiple arguments into print, e.g. print('a', 1).
Doing this automatically separates each printed value with a space.
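As a small illustration of printing one value and then several, here is a sketch; the variable name and its value are hypothetical examples, not taken from the original guide:

```python
y = 3.45
name = "Ada"  # hypothetical example value

print(y)           # prints: 3.45
print(name, y, 7)  # prints: Ada 3.45 7
```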
There are four commonly used types of values:
Type | Example | Explanation |
---|---|---|
int | 1 | An integer |
str (string) | 'a string of text' or "more text" | Text, wrapped in single or double quotation marks |
bool | True or False | A boolean value, either True or False (these behave like 1 and 0 in arithmetic) |
float | 3.14159 | A real number with decimals |
To change the type of a value, we use the functions int(), str(), bool(), and float(). You pass in the value that you want to convert, and the function returns that value as an int, string, bool, or float.
If you pass a string into int(), it must represent a whole number, such as '42' or '-7'. A string containing a decimal point or letters, such as '4.5' or 'abc', raises a ValueError.
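A short sketch of these conversions, with arbitrary example values:

```python
whole = int("42")        # string -> int
decimal = float("3.5")   # string -> float
text = str(10)           # int -> string
flag = bool(0)           # 0 -> False; any non-zero number -> True

# A string with a decimal point cannot be passed to int() directly
error_raised = False
try:
    int("4.5")
except ValueError:
    error_raised = True
```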
caution
int() truncates the input: it removes everything after the decimal point, which differs from flooring when the number is negative. For example, int(-4.5) will return -4.
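To make the difference concrete, this sketch compares int() with math.floor() from the standard library (math.floor is brought in here only for comparison):

```python
import math

truncated = int(-4.5)       # drops the decimal part: -4
floored = math.floor(-4.5)  # rounds down: -5

print(truncated, floored)   # prints: -4 -5
```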
Operations on Variables
Additionally, you can perform multiple operations on variables.
For operations on ints and floats (i.e. numbers), you can perform basic arithmetic.
For example, if I let x = 7 and y = 2, then,
Operation | Example | Value | Type |
---|---|---|---|
Addition | x+y | 9 | int |
Subtraction | x-y | 5 | int |
Division | x / y | 3.5 | float |
Modulo / Remainder | x % y | 1 | int |
Exponent | x**y | 49 | int |
note
Division with / always returns a float in Python 3, even when the quotient is a whole number (e.g. 4 / 2 is 2.0). To get an integer quotient, use the floor division operator //.
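The table can be reproduced with the sketch below, assuming x = 7 and y = 2, the values that yield the results listed:

```python
x = 7
y = 2

print(x + y)   # 9
print(x - y)   # 5
print(x / y)   # 3.5
print(x % y)   # 1
print(x ** y)  # 49

# / always gives a float; // gives the floored quotient
print(4 / 2)   # 2.0
print(x // y)  # 3
```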
Strings
Strings are the data type for plain text. Python treats a string as a sequence of individual characters, much like a list.
To declare a string, you must use single or double quotation marks, e.g. 'a string of text' or "more text".
However, notice that you cannot add a line break in the middle of such a string; otherwise an error occurs. To write a multi-line string, you use triple quotation marks (''' or """).
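A brief sketch of the three ways of declaring a string described above (the text inside the strings is arbitrary):

```python
single = 'a string of text'
double = "more text"

# Triple quotation marks allow the string to span several lines
multi_line = """This string
spans two lines."""

print(multi_line)
```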
We will cover some simple operations with strings, including concatenation and parsing, but for more information, see W3Schools Python String Methods.
Concatenation
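Concatenation is the act of putting two strings together. Python has multiple ways to concatenate strings, including using a + to "add" two strings together, e.g. 'ab' + 'cd' gives 'abcd'.
Alternatively, when both strings are written out as literals (not stored in variables), you can just leave a space between them, e.g. 'ab' 'cd' also gives 'abcd'.
Additionally, if you would like to duplicate your string, you can "multiply" it by an integer, e.g. 'ab' * 3 gives 'ababab'.
Concatenation with + must only involve strings; otherwise Python raises a TypeError, so convert other values with str() first.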
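The concatenation options above can be sketched as follows; the example strings are arbitrary:

```python
first = "data"
second = "science"

joined = first + " " + second  # "data science"
adjacent = "data" "science"    # adjacent literals are joined: "datascience"
repeated = "ha" * 3            # "hahaha"

# Numbers must be converted with str() before concatenating
labelled = "grade " + str(9)   # "grade 9"
```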
Quotation Marks
One tricky thing is that if you use quotation marks to define a string, how do you actually include quotation marks within a string?
There are a couple of workarounds. If you define a string using single quotation marks, then everything before the next single quotation mark is included in the string, so double quotation marks can appear inside it freely.
The reverse is also true: if you define a string with double quotation marks, you can freely use single quotation marks inside the string.
Alternatively, you can use the escape character \ before each quotation mark in your string, as that indicates that you want to treat the next character literally.
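A sketch of the three workarounds, using made-up sentences:

```python
with_double = 'She said "hello" to me.'  # double quotes inside single quotes
with_single = "It's a nice day."         # single quote inside double quotes
escaped = 'It\'s a "nice" day.'          # backslash escapes the inner single quote

print(escaped)  # prints: It's a "nice" day.
```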
String Indexing
Earlier, we mentioned how Python treats strings like lists of characters. Thus, like lists, you can index into them to get the character at any given index or position. Like most languages, Python begins indexing at 0, so a string of length n will have indices from 0 to n-1.
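For instance, indexing into an arbitrary example string might look like this:

```python
word = "Python"

first = word[0]             # 'P'
third = word[2]             # 't'
last = word[len(word) - 1]  # 'n', the character at index n-1
```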
Other Methods
We'll touch on a few other common string methods.
First, if you want to lowercase or uppercase a string, you can use the methods lower() and upper().
Second, if you want to split a string into a list by some delimiter, you can use the method split(), which takes in a delimiter, e.g. " " or ",", and returns the list of pieces of that string split by that delimiter.
note
split() removes the delimiter you used. Additionally, if your string begins with and / or ends with the delimiter, the returned list will include an empty string at that position.
Lastly, if you want to replace every occurrence of a substring, you can use the method replace(), which takes in the string to be replaced and another string to replace it with.
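The methods above can be tried out with a sketch like the following, where the example strings are arbitrary:

```python
sentence = "Hello World"

lower = sentence.lower()              # 'hello world'
upper = sentence.upper()              # 'HELLO WORLD'
words = sentence.split(" ")           # ['Hello', 'World']
edges = ",a,b,".split(",")            # ['', 'a', 'b', '']
swapped = sentence.replace("l", "L")  # 'HeLLo WorLd'
```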
Lists and List methods
Lists are an easy and flexible way to store different types of information. Since lists are ordered, they can also be accessed via index (e.g. the first item of the list being at index 0).
To declare a list, use brackets [ ] to enclose the list and commas , to separate each item.
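A minimal sketch of a list declaration, with arbitrary example items:

```python
# Items are enclosed in brackets and separated by commas
mixed = [1, "two", 3.0, True]

print(mixed)  # prints: [1, 'two', 3.0, True]
```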
Notice how a single list can contain multiple data types.
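To declare an empty list, you can use empty brackets [] or the list() function.
The list() function creates an empty list when called with no arguments, but also turns its input into a list when given an iterable such as a string.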
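As a quick sketch of both uses of list() (variable names are illustrative):

```python
empty_a = []           # empty list using brackets
empty_b = list()       # empty list using list()
letters = list("abc")  # a string becomes a list of characters
```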
Here, we will cover some basic info about lists and common list operations.
Accessing Elements in Lists
One way of accessing items within a list is by index.
note
A list of size n (i.e. one with n elements) begins at index 0 and goes to index n-1.
Additionally, given a list lst and integer index i, you can access the element at index i by using brackets, i.e. lst[i] returns the element at index i in lst.
You can also use a colon : to access a subset of items in a list. For example, lst[a:b] returns the items from index a up to but not including the item at index b. If a is not specified, it is assumed to be index 0. Similarly, if b is not specified, the slice runs through the end of the list, i.e. up to and including the last item at index n-1.
For example,
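A short sketch of indexing and slicing on an arbitrary example list:

```python
lst = ['a', 'b', 'c', 'd', 'e']

first = lst[0]     # 'a'
middle = lst[1:3]  # ['b', 'c'] (index 3 is not included)
start = lst[:2]    # ['a', 'b'] (a defaults to 0)
end = lst[3:]      # ['d', 'e'] (runs through the last item)
```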
In addition to retrieving an item by index, you can also get the index at which an element appears by using the list method index().
note
Note that index() returns the index of the first time the input appears in the list. E.g. for a list lst containing the element 'a', lst.index('a') returns the index of the first 'a'.
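For example, with a hypothetical list in which 'a' appears twice:

```python
lst = ['b', 'a', 'c', 'a']

position = lst.index('a')  # 1, the index of the first 'a'
```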
Adding Elements to List
To add an item to a list, you can append items to the end of the list, or insert a value into a specific index.
To append items to a list, use the list method append().
warning
Using append() mutates the list, i.e. it changes the list in place.
To insert an item, you can use insert(): given a list lst, index i, and an element elt, lst.insert(i, elt) inserts element elt into lst at index i.
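A sketch of both methods on an arbitrary example list, showing that each changes the list in place:

```python
lst = [1, 2, 3]

lst.append(4)         # lst is now [1, 2, 3, 4]
lst.insert(1, 'new')  # lst is now [1, 'new', 2, 3, 4]

print(lst)
```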
Removing Elements from List
To remove an item from a list, you can remove by index or value. Like append(), these methods will mutate the list in place.
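To remove by index, you can use lst.pop(i), which removes and returns the item at index i, or the statement del lst[i].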
To remove by element value, you can use lst.remove(item), which removes the first element of the list lst whose value matches item.
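The sketch below shows removal by index and by value on an arbitrary example list. It uses pop() and del for removal by index, which are two standard options and an assumption about which one the guide intends:

```python
lst = ['a', 'b', 'c', 'b', 'd']

removed = lst.pop(0)  # removes and returns 'a'; lst is ['b', 'c', 'b', 'd']
del lst[1]            # removes 'c';            lst is ['b', 'b', 'd']
lst.remove('b')       # removes the first 'b';  lst is ['b', 'd']
```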
List Operations
Additionally, there are multiple operations you can perform on lists.
To concatenate lists, you can use the addition + operator.
If you want to repeat a list, you can use the multiplication * operator.
Similar to getting the length of a string, to get the length of a list, use the built-in Python function len().
To reverse a list's order, lst.reverse() will reverse all items in lst in place.
To sort a list, lst.sort() will sort all items in lst in place.
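The list operations above can be sketched together as follows, using arbitrary example lists:

```python
a = [3, 1, 2]
b = [5, 4]

combined = a + b        # [3, 1, 2, 5, 4]; a and b are unchanged
repeated = b * 2        # [5, 4, 5, 4]
length = len(combined)  # 5

combined.reverse()      # combined is now [4, 5, 2, 1, 3]
combined.sort()         # combined is now [1, 2, 3, 4, 5]
```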
Dictionaries
Dictionaries are like named lists, in that they are mutable and can hold values. However, unlike lists, they attach a key, as opposed to an index, to each value. These key-value pairs make up the dictionary. Values can be any data-type (e.g. strings, ints, lists, dictionaries, etc) while keys must be unique and immutable (e.g. strings, ints, etc. but not lists).
A dictionary is declared with curly braces { } around comma-separated key: value pairs. To get a value, such as 9, we access it with its key in brackets.
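A minimal sketch of a declaration and a lookup. The key "freshman" with value 9 follows the example used later in this guide; the second entry is a hypothetical addition:

```python
grades = {"freshman": 9, "sophomore": 10}

value = grades["freshman"]  # look up the value stored under the key

print(value)  # prints: 9
```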
We'll explore more dictionary functions below.
Creating Dictionaries
First, let us declare a dictionary grades.
To add items to a dictionary, you give a value to a key, and if the key does not yet exist, it creates a new key-value pair. If a key already exists, it resets the value.
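As a sketch of adding and overwriting entries, assuming grades starts out empty (the keys and values are illustrative):

```python
grades = {}                # an empty dictionary

grades["freshman"] = 9     # key does not exist yet: creates a new pair
grades["sophomore"] = 11   # creates another pair
grades["sophomore"] = 10   # key already exists: the value is overwritten

print(grades)  # prints: {'freshman': 9, 'sophomore': 10}
```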
Dictionary Operations
There are various functions to be used with dictionaries.
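To get the keys of the dictionary grades, use grades.keys().
Similarly, to get the values of a dictionary grades, use grades.values().
To get the items as tuples, i.e. as (key, value) pairs, use grades.items().
In Python 3 these methods return view objects rather than lists. You can loop over them directly, or wrap them in list() to get an actual list.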
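For instance, with a small hypothetical grades dictionary, the three methods can be converted to lists like so:

```python
grades = {"freshman": 9, "sophomore": 10}

keys = list(grades.keys())      # ['freshman', 'sophomore']
values = list(grades.values())  # [9, 10]
items = list(grades.items())    # [('freshman', 9), ('sophomore', 10)]
```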
Conditionals
Conditionals are if statements, which take some boolean expression (i.e. an expression that evaluates to True or False) and, "if" it is True, perform an action.
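More concisely, in Python syntax, it is the keyword if followed by the boolean expression and a colon, with the action indented on the lines below it.
For example, if x and y are both 5, an if statement that tests x == y and prints a message in its body will print '5 is equal to 5', since x == y is True.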
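A minimal sketch of such an if statement, assuming x and y are both 5:

```python
x = 5
y = 5

if x == y:
    # This indented line only runs when the condition is True
    print(x, "is equal to", y)  # prints: 5 is equal to 5
```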
But what about when x is not equal to y? Below, we have a more comprehensive if statement using elif (short for "else if") and else.
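One way such a chain could look is sketched here; the values of x and y and the messages are illustrative assumptions:

```python
x = 3
y = 5

if x > y:
    message = "x is greater than y"
elif x < y:
    message = "x is less than y"
else:
    message = "x is equal to y"

print(message)  # prints: x is less than y
```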
Here, we check if x > y, and if that is not true, then it goes to the next statement in the chain, elif. If the elif statement is not true (i.e. x is not < y), then it will jump to the next statement (which can be more elif statements or else). Finally, the last statement, else, means that if none of the other statements evaluate to True, the action below it is performed.
Loops
In Python, whenever you want to iterate over a list, dictionary, string, etc. (any iterable), you can use for loops to perform an action on each element of the iterable.
An example of a for loop would be, if I had some list named students, writing for student in students: followed by an indented action such as print(student). Here I chose student to refer to each element of the list students (but any valid variable name would have worked as well).
Loop by Element
Elaborating on this, if I have a list students, then the above for loop will print each element of the list on its own line, in order.
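A sketch of looping by element; the names in students are hypothetical:

```python
students = ["Ana", "Ben", "Cy"]

for student in students:
    print(student)

# prints:
# Ana
# Ben
# Cy
```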
Loop by Index
If I want to loop through the list by index, I can use range() together with len().
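Looping by index might be sketched like this, again with a hypothetical students list:

```python
students = ["Ana", "Ben", "Cy"]

# range(len(students)) yields the indices 0, 1, 2
for i in range(len(students)):
    print(i, students[i])

# prints:
# 0 Ana
# 1 Ben
# 2 Cy
```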
note
range(a, b, k) is an incredibly useful function which returns a sequence of numbers from a up to (but not including) b in step sizes of k. If you do not pass in k, it assumes a step size of 1. And if you only pass in one argument, it is treated as b, so range(b) returns the sequence of ints from a=0 up to (but not including) b in step sizes of k=1.
For example, range(2, 10, 3) produces 2, 5, 8, and range(3) produces 0, 1, 2.
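The three ways of calling range() can be checked with this sketch, where list() is used only to display the numbers each call produces:

```python
one_arg = list(range(5))          # [0, 1, 2, 3, 4]
two_args = list(range(2, 6))      # [2, 3, 4, 5]
with_step = list(range(2, 10, 3)) # [2, 5, 8]
```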
Loop by Index and Element
If I want to know both the index and the element, I can use enumerate(), which returns an object similar to a list of tuples (but is not actually a list of tuples).
note
For a given list lst, enumerate(lst) will return an enumerate object which pairs each element with a counter that starts at 0 by default. You can change the starting value by passing in some number k, e.g. enumerate(lst, k), which will then return (index, element) pairs whose indices are the original index (from 0 to n-1) plus k (so they now run from k + 0 to k + n-1).
It is important to know that to actually get the list of (index, element) tuples from enumerate, you must convert the object to a list with list(). For example, list(enumerate(students)) returns the list of tuples, but enumerate(students) on its own just returns an <enumerate object>.
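A sketch of enumerate() in a loop, converted to a list, and with a different starting value; the students list is hypothetical:

```python
students = ["Ana", "Ben", "Cy"]

# Each iteration unpacks one (index, element) pair
for index, student in enumerate(students):
    print(index, student)

pairs = list(enumerate(students))       # [(0, 'Ana'), (1, 'Ben'), (2, 'Cy')]
shifted = list(enumerate(students, 1))  # [(1, 'Ana'), (2, 'Ben'), (3, 'Cy')]
```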
Loop in Dictionaries
To use for loops with dictionaries, you can use the above mentioned keys(), values(), and items() methods, which let you iterate over the keys, values, or key-value pairs.
For example, if I have a dictionary grades, the below for loops will print "freshman is grade 9" and so on, but in two different ways.
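The two approaches could be sketched as follows. Only the "freshman": 9 entry comes from the text; the second entry is a hypothetical addition:

```python
grades = {"freshman": 9, "sophomore": 10}

# Way 1: loop over the keys and look up each value
for key in grades.keys():
    print(key, "is grade", grades[key])

# Way 2: loop over (key, value) pairs directly
for key, value in grades.items():
    print(key, "is grade", value)

# Each loop prints:
# freshman is grade 9
# sophomore is grade 10
```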
Functions
Oftentimes, we will want to perform the same action (with the same chunk of code) in multiple areas. To avoid copy & pasting everything multiple times, we define functions. Functions can take in inputs and return an output.
The syntax for defining a function is:
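A minimal sketch of a definition and a call, using the square function described in the text (the argument 2 is chosen so that the result is 4):

```python
def square(x):
    # x is the name the function uses internally for its input
    return x ** 2

result = square(2)
print(result)  # prints: 4
```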
where square(x) takes in a variable and returns the value squared. To call this function, we can run it with an input such as 2, i.e. square(2), which would then return 4. Notice how the input is a plain value rather than a variable named x. This is because x is just what the function internally calls the input, and is not necessarily the name of the input outside of the function definition.
important
What happens when the input, let's call it x, is both defined outside the function and inside the function?
Suppose x = 10 is defined as a variable outside of the function, and x is also the name of the input variable to square. Python resolves the name x according to where it is used. Inside the function definition, x refers to the input, because that is the most recent definition of x in that scope. Outside the function definition, x still has the value 10, because the input variable only exists within the function definition.
Thus, calling square on some value uses that value inside the function, while x outside the function remains 10.
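A small sketch of this scoping behaviour; the argument 3 is an arbitrary illustrative choice:

```python
x = 10  # defined outside the function

def square(x):
    # Inside the function, x refers to the input, not the outer x
    return x ** 2

inside = square(3)  # 9: the function used its input, 3
outside = x         # 10: the outer x is unchanged

print(inside, outside)  # prints: 9 10
```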
Another important thing to note is that if you use functions that mutate any variable (e.g. change it in place) within your function definition, then that variable, as you might have guessed, will be mutated.
For example, below I define two functions which take as input some variable n and add it onto the end of the list.
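One possible version of the two functions is sketched here. It assumes lst is a list defined outside the functions, and the function bodies are a reconstruction from the description rather than the original code:

```python
lst = [1, 2, 3]

def extend_0(n):
    # + builds and returns a new list; lst itself is not changed
    return lst + [n]

def extend_1(n):
    # append() changes lst in place and itself returns None
    return lst.append(n)

new_list = extend_0(4)    # [1, 2, 3, 4]
unchanged = list(lst)     # copy of lst at this point: still [1, 2, 3]
result = extend_1(4)      # None, but lst is now [1, 2, 3, 4]
```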
Here, extend_0 does not mutate the list, as + returns a value and does not change lst in place; this function will then return some new list. However, extend_1 uses append(), which does mutate lst and changes it in place. Additionally, since append() performs an action and does not have a return value, it will return None.
note
Whenever there is no specified return value, Python will return None. Thus, functions that mutate and do not return a value, like append() or remove(), or simply perform an action, like print(), will all return None.
Modules and Packages
Oftentimes, there are functions that people want to use for different projects and don't want to copy over or rewrite the same function multiple times. This is where imports come in.
Python has an incredibly large number of available libraries for doing almost anything you could possibly need, from making graphs, to analyzing data, to even implementing neural networks.
To access these libraries we use imports. For example, some common libraries for data analysis are NumPy and Pandas. Thus, to use these libraries, you would write import numpy as np and import pandas as pd, where we have imported the libraries NumPy and Pandas and given them a "nickname" (np and pd) so that we don't have to type the entire word each time we call on a function from the library.
To use functions from these libraries, e.g. NumPy's array function, which takes in a list and returns an ndarray (NumPy's own array), I would prefix the function with the nickname, e.g. np.array([1, 2, 3]).
If I do not want to type np.array each time I call the function, or only want to import that function and not the whole library, I can specify that I just want to import array, e.g. from numpy import array, and then I can just type array() each time I call the function.
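Both import styles can be tried without installing anything by using standard-library modules as stand-ins. This sketch swaps in statistics and math in place of NumPy and Pandas, and the nickname stats is an illustrative choice. The same pattern applies to import numpy as np and from numpy import array:

```python
# Style 1: import a whole module under a nickname
import statistics as stats

# Style 2: import a single function from a module
from math import sqrt

mean_value = stats.mean([1, 2, 3, 4])  # called through the nickname
root = sqrt(16)                        # called directly by name

print(mean_value, root)  # prints: 2.5 4.0
```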
What Next?
Hopefully this guide served as a helpful introduction to basic Python and Python syntax, and will equip you with the understanding to explore further on your own! If you have any questions or would like to learn more, feel free to get in touch with anyone on HODP Board!
For more advanced topics like classes, list comprehensions, and more, see Python Intermediate.
For more information on NumPy and Pandas, two of the most popular and commonly used libraries for data analysis, see NumPy + Pandas.