How to work with Python#
What is Colab?#
Colab, or “Colaboratory”, allows you to write and execute Python in your browser, with
Zero configuration required
Access to GPUs free of charge
Easy sharing
Whether you’re a student, a data scientist or an AI researcher, Colab can make your work easier.
Getting started#
The document you are reading is not a static web page, but an interactive environment called a Colab notebook that lets you write and execute code.
For example, here is a code cell with a short Python script that computes a value, stores it in a variable, and prints the result:
# run the code
seconds_in_a_day = 24 * 60 * 60
seconds_in_a_day
86400
To execute the code in the above cell, select it with a click and then either press the play button to the left of the code, or use the keyboard shortcut “Command/Ctrl+Enter”. To edit the code, just click the cell and start editing.
Variables that you define in one cell can later be used in other cells:
seconds_in_a_week = 7 * seconds_in_a_day
seconds_in_a_week
604800
New cell
You can add new cells by using the + CODE and + TEXT buttons that show when you hover between cells. These buttons are also in the toolbar above the notebook where they can be used to add a cell below the currently selected cell.
Resources#
Links work in Colab!
Working with notebooks in Colab#
General information about Colaboratory
Markdown guide
Importing libraries
Working with data#
Downloading the data from Drive and Google Cloud Storage
Visualization of the data
Python#
Python is a high-level, general-purpose programming language.
Libraries#
List of the most frequently used libraries:
NumPy is short for Numerical Python, a Python library predominantly used for technical and scientific computing. Its array-oriented computing capabilities make it an essential tool for fields such as linear algebra, statistical analysis, and machine learning.
SciPy is a Python library used for scientific and technical computing. It is built on top of NumPy and adds functionality for various scientific computing tasks: optimization methods, integration, signal and image processing modules, statistics, linear algebra, splines, clustering, and much more.
Pandas is an open-source data manipulation library for Python. It is built on top of the NumPy library. It introduces two primary data structures, Series and DataFrame. A Series is a one-dimensional labelled array, whereas a DataFrame is a two-dimensional labelled structure resembling a table.
Scikit-learn is a machine-learning library that provides tools for data mining and analysis. It includes many machine learning algorithms for different tasks.
Matplotlib is a data visualization library that allows developers to create static, animated, and interactive visualizations in Python. The graphs and plots it produces are extensively used for data visualization.
Part 1 - Basics of Python#
For applied purposes, it is important to master several aspects:
Standard functions
Variables
Basic structures
Functions
Usage of libraries
Standard functions#
A function is called by writing its name followed by parentheses. Inside the parentheses you specify the arguments (the objects the operation is performed on) and the parameters (the conditions under which it is performed).
print()#
A function for displaying content on the screen.
Usually you pass at least one object to be printed; calling print() with no arguments outputs an empty line.
Its full syntax is:
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)
objects: one or more objects to output, separated by commas
sep: separator between several objects; the default is a space
end: what is printed after the last object; the default is the newline character '\n'
print(5)
5
print(5, 6, 7, 8)
5 6 7 8
print(5, 6, 7, 8, sep='_')
5_6_7_8
print('October')
October
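The examples above cover objects and sep. As a minimal sketch of the remaining parameters, the cell below uses end and file; the in-memory buffer and the variable names are only illustrative assumptions, chosen so that the exact printed characters can be inspected.

```python
import io

# Collect the printed text in an in-memory buffer via the `file` parameter,
# so the exact characters (including line endings) can be inspected.
buffer = io.StringIO()

print('Loading', end='...', file=buffer)             # no newline after this call
print('done', file=buffer)                           # default end='\n'
print(2023, 10, 5, sep='-', end='!\n', file=buffer)  # custom sep and end together

printed_text = buffer.getvalue()
print(repr(printed_text))
```

Without the file argument, the same calls write directly to the cell output.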
Task 1#
Print the phrase: “Hello, world!”
# Write your code here
Arithmetic operations#
Addition: +
Subtraction: -
Multiplication: *
Division: /
Exponentiation: **
Integer division: //
The remainder of the division: %
print(5 + 2)
print(5 - 2)
print(5*2)
print(5 / 2)
print( 5 // 2)
print(5 % 2)
print(5**2)
7
3
10
2.5
2
1
25
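Integer division and the remainder are linked: multiplying the quotient by the divisor and adding the remainder gives back the original number. The short sketch below checks this relation; the variable names are illustrative, and the negative example is included because // rounds down rather than toward zero.

```python
a = 17
b = 5

quotient = a // b    # integer division
remainder = a % b    # remainder of the division

# The two results always reconstruct the original number
print(quotient, remainder, quotient * b + remainder == a)

# For negative numbers, // rounds down (toward minus infinity)
negative_quotient = -17 // 5
negative_remainder = -17 % 5
print(negative_quotient, negative_remainder)
```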
Variables and types of variables#
The results of operations and manipulations sometimes need to be saved, so we define variables - named objects to which we assign values.
savings_2017 = 2000
savings_2016 = 1800
increase = ((savings_2017 - savings_2016) / savings_2016) * 100
# this way you can view the information about the function in detail
round?
print(round(increase), '%', sep='')
11%
The type function returns the type of a value or variable.
print(type(savings_2017))
print(type(increase))
<class 'int'>
<class 'float'>
text = 'Monetary Savings in 2017'
print(type(text))
<class 'str'>
x = False
print(type(x))
<class 'bool'>
If you need to change the type of a value, the following functions are useful:
int() - returns an integer
float() - returns a floating-point number
bool() - returns a Boolean value
str() - returns a string
print(savings_2017, type(savings_2017))
savings_2017_new = str(savings_2017)
print(savings_2017_new, type(savings_2017_new))
print('Savings ' + 'in 2017: ' + savings_2017_new)
2000 <class 'int'>
2000 <class 'str'>
Savings in 2017: 2000
print(int(savings_2017_new) - savings_2016)
200
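A few more conversions may help to see how these functions behave at the edges. This is only an illustrative sketch with made-up values: int() drops the fractional part instead of rounding, and bool() treats zero and the empty string as False, while any non-empty string is True.

```python
# Strings containing numbers can be converted to numeric types
count = int('42') + 1          # 43
price = float('3.5')           # 3.5

# int() discards the fractional part, it does not round
truncated = int(3.9)           # 3

# bool() is False for zero and for empty strings, True otherwise
zero_flag = bool(0)            # False
empty_flag = bool('')          # False
text_flag = bool('False')      # True: the string is not empty

# Numbers must be converted to str before concatenation with text
label = str(11) + '%'          # '11%'

print(count, price, truncated, zero_flag, empty_flag, text_flag, label)
```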
Task 2#
Correct the line of code so that it works:
# print("I started with $" + savings_2016 + " and now have $" + savings_2017_new + ". Not good!")
Lists#
Each of the four data types considered so far stores a single value in a variable:
int: an integer
float: a number with a fractional part
str: text
bool: a boolean variable
For example, if we want to record the savings of family members, we can create several variables and store the information in them:
savings1 = 500
savings2 = 1000
savings3 = 746
savings4 = 2000
This is not very convenient. Instead, we can store all the information in one variable:
savings_family = [500, 1000, 746, 2000]
This construction is called a list and is a separate data type. The elements of the list can be accessed by the index:
print(savings_family[2])
746
print(savings_family[-1])
2000
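Besides single indices, Python lists also accept slices of the form start:stop, which return a new list (stop itself is not included). The sketch below uses a separate, hypothetical variable name so that it does not interfere with savings_family.

```python
savings_example = [500, 1000, 746, 2000]

first = savings_example[0]         # indices start at 0
middle = savings_example[1:3]      # elements with indices 1 and 2
first_two = savings_example[:2]    # from the beginning up to index 2 (excluded)
last_two = savings_example[-2:]    # the last two elements
size = len(savings_example)        # number of elements in the list

print(first, middle, first_two, last_two, size)
```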
List of lists :)
savings_family = [['mom', 500], ['dad', 1000], ['brother', 746], ['me', 2000]]
s = 0
# Loop that sums up all the savings
for member in savings_family:
    s = s + member[1]
print(s)
4246
print(savings_family)
[['mom', 500], ['dad', 1000], ['brother', 746], ['me', 2000]]
Task 3#
Make a list called home that contains the name and area of each room in the house (kitchen - 10, bedroom - 12, bathroom - 5). Print the list.
There are many methods (functions that belong to list objects) for working with lists:
list.append(x): adds an item to the end of the list.
list.extend(L): extends the list by adding all the elements of the list L to the end.
list.insert(i, x): inserts the value x at position i.
list.remove(x): deletes the first item in the list that has the value x; raises ValueError if no such item exists.
list.pop([i]): deletes the i-th element and returns it. If the index is not specified, the last element is deleted and returned.
list.index(x[, start[, end]]): returns the position of the first element with the value x (the search is conducted from start to end).
list.count(x): returns the number of items with the value x.
list.sort([key=function]): sorts the list in place, optionally using the key function.
list.reverse(): reverses the list in place.
list.copy(): returns a shallow copy of the list.
list.clear(): removes all items from the list.
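The methods listed above can be tried out on a small throwaway list. The sketch below is only an illustration with a hypothetical list called numbers; the comments show the state of the list after each call.

```python
numbers = [3, 1, 2]

numbers.append(5)             # [3, 1, 2, 5]
numbers.extend([1, 4])        # [3, 1, 2, 5, 1, 4]
numbers.insert(0, 10)         # [10, 3, 1, 2, 5, 1, 4]
numbers.remove(1)             # removes the first 1: [10, 3, 2, 5, 1, 4]
last = numbers.pop()          # returns 4, list is now [10, 3, 2, 5, 1]
position = numbers.index(5)   # 3
ones = numbers.count(1)       # 1
numbers.sort()                # [1, 2, 3, 5, 10]
numbers.reverse()             # [10, 5, 3, 2, 1]

snapshot = numbers.copy()     # independent shallow copy
numbers.clear()               # the original list is now empty

print(last, position, ones, snapshot, numbers)
```

Note that most of these methods change the list in place and return None, so writing numbers = numbers.sort() would lose the list.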
# deleting the entry about the last family member
savings_family.pop()
print(savings_family)
[['mom', 500], ['dad', 1000], ['brother', 746]]
# and now let's add a new one
savings_family.append(['me', 2000])
# We create a list of two lists, which stores information about the savings of an uncle,
# who has 4,000 rubles, and the savings of an aunt, who has 10,000 rubles
savings_relatives = [['uncle', 4000], ['aunt', 10000]]
Task 4#
Add the list of relatives' savings to the list of family savings. Print the resulting list.
Task 5#
Compute the total of all savings.
for member in savings_family:
    # your code here
    print(member)
['mom', 500]
['dad', 1000]
['brother', 746]
['me', 2000]
Basic language constructions#
Logical expressions#
By analogy with arithmetic expressions, there are logical expressions that can be true or false. A simple logical expression has the form <arithmetic expression> <comparison sign> <arithmetic expression>.
For example, if we have variables x and y with some values, then the logical expression x + y < 3 * y has x + y as the first arithmetic expression, < (less than) as the comparison sign, and 3 * y as the second arithmetic expression.
Comparison operator | Meaning |
---|---|
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
== | Equal |
!= | Not equal |
x = 1 > 2
print(type(x))
print(x)
print(int(x))
<class 'bool'>
False
0
To write a complex logical expression, it is often necessary to use the logical connectives “and”, “or” and “not”. In Python, they are written as and, or and not, respectively. The and and or operations are binary, i.e. they must be written between two operands, for example x < 3 or y > 2. The not operation is unary and must be written before its single operand.
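As a small sketch of how the three connectives combine comparisons, the cell below evaluates each of them once. The values of x and y and the variable names are arbitrary assumptions chosen for illustration.

```python
x = 2
y = 5

either = x < 3 or y > 10     # True: the first comparison holds
both = x < 3 and y > 10      # False: the second comparison fails
negated = not x < 3          # False: x < 3 is True, and not inverts it

print(either, both, negated)
```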
Task 6#
# Set three numbers x, y, z and try different complex logical expressions:
# print(x == y)
# print(x > y and y < z)
Task 7#
Check whether x is greater than 10. Print the result.
x = 7
Task 8#
Do mom’s savings fall into the range from 300 to 500? Write a boolean expression that checks this condition.
savings_mom = 500
Conditional operator#
Logical expressions are most often used in conditional statements.
Syntax in Python:
if condition1:
    commands
elif condition2:
    commands
elif condition3:
    commands
else:
    commands
x = 5
y = 7
if x > y:
    print('X is greater than Y')
    print(x - y)
else:
    print('X is not greater than Y')
X is not greater than Y
# Uncomment the next two lines and run the cell
# x = int(input('Enter x: '))
# y = int(input('Enter y: '))
if x > y:
    print('X is greater than Y')
    print(x - y)
elif y > x:
    print('Y is greater than X')
    print(y - x)
else:
    print('X equals Y')
Y is greater than X
2
Task 9#
The kitchen area is given. If it exceeds 15 square meters, the line “Kitchen is big!” should be printed; otherwise, “Kitchen is small”.
room = "kit"
area = 14.0
# if ...:  # your code here
#     print()
# else:
#     print()
The while loop#
Once a line of code has been executed, it is “forgotten”: to repeat the action, you would have to write it out again. Loops are used to repeat the same actions many times.
while allows you to execute commands as long as the condition is true. After the block of commands belonging to while has been executed, control returns to the line with the condition. If the condition is True, the block is executed again; if it is False, execution continues with the commands written after the while block.
It is important not to accidentally create an endless loop!
while condition1:
    commands
All commands are indented.
temperature = 5
while temperature > 0:
    print(f'Temperature is {temperature}. The weather is OK.')
    temperature = temperature - 1
Temperature is 5. The weather is OK.
Temperature is 4. The weather is OK.
Temperature is 3. The weather is OK.
Temperature is 2. The weather is OK.
Temperature is 1. The weather is OK.
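One common way to guard against an endless loop is to add a step counter to the condition, so the loop stops even if the main condition never becomes false. The following is a sketch under that assumption; max_steps is a hypothetical safety limit, not something required by Python.

```python
value = 100
steps = 0
max_steps = 1000  # hypothetical safety limit against an endless loop

# Halve the value until it is no longer greater than 1,
# but never run more than max_steps iterations
while value > 1 and steps < max_steps:
    value = value / 2
    steps = steps + 1

print(steps, value)
```

The loop here ends because value keeps shrinking; if the line that updates value were forgotten, the counter would still stop the loop after max_steps iterations.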
Task 10#
For a given positive integer N, print the squares of all integers from N down to 1, in descending order.
Sample
Input data:
3
Program output:
9
4
1
N = 5
# Your code
For loop#
The range(n) function#
Creates an object of the range class, which represents an arithmetic progression within the specified bounds and with the specified step. The stop value itself is not included.
Its syntax is:
range(start, stop[, step])
print(range(5))
print(list(range(5)))
print(list(range(2, 7)))
print(list(range(2, 6, 2)))
range(0, 5)
[0, 1, 2, 3, 4]
[2, 3, 4, 5, 6]
[2, 4]
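The step may also be negative, in which case the progression counts down, and a range whose start is already past stop is simply empty. The sketch below illustrates both cases with arbitrary numbers and illustrative variable names.

```python
countdown = list(range(10, 0, -3))   # start at 10, go down by 3, stop before 0
empty = list(range(5, 2))            # start is already past stop, so nothing is produced
length = len(range(2, 7))            # a range knows its length without building a list

print(countdown, empty, length)
```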
The for loop allows you to iterate through the elements of any iterable object.
Its syntax is:
for i in iterable:
    commands
All commands are indented.
n = 5
for i in range(n + 1):
    print(i**2)
0
1
4
9
16
25
for color in ['red', 'green', 'yellow']:
    print(color, 'apple')
red apple
green apple
yellow apple
Writing functions#
If some operations need to be repeated from time to time and there is no ready-made function for them, you can create your own.
Syntax:
def function_name(list_of_arguments):
    commands
    return result_of_the_function
All commands are indented.
A function must be defined before it is first called.
def power(number, p):
    result = number**p
    return result
# Uncomment and try to run the cell
# number = int(input("Number: "))
# p = int(input("Power: "))
# print(power(number, p))
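The function can also be called without keyboard input. The sketch below repeats the definition of power so that the cell is self-contained, and then calls it with positional and keyword arguments; the chosen numbers are arbitrary.

```python
def power(number, p):
    """Return number raised to the power p."""
    result = number**p
    return result

# Arguments can be passed by position...
kilobyte = power(2, 10)

# ...or by name, in which case the order does not matter
square_root = power(p=0.5, number=9)

print(kilobyte, square_root)
```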
Task 11#
Write a function that returns the perimeter of a rectangle using the length of the two sides.
# def rectangle(a, b):
#     Your code
a = 4
b = 10
# print(rectangle(a, b))
Working with NumPy#
NumPy is a Python module that provides common mathematical and numerical operations in the form of fast functions. Its functionality is comparable to that of MATLAB. NumPy (Numerical Python) provides basic methods for manipulating large arrays and matrices. SciPy (Scientific Python) extends the functionality of NumPy with a huge collection of useful algorithms such as minimization, Fourier transforms, regression, and other applied mathematical techniques.
To use any library, you need to import it into the notebook:
import library_name
For convenience, the library name can be shortened by specifying an alias in the import:
import library_name as alias
import numpy as np
The main feature of NumPy is the array object. Arrays are similar to lists in Python, except that all elements of an array must have the same data type, such as float or int. With arrays, numerical operations on large amounts of data can be performed many times faster and more efficiently than with lists.
Creating an array from a list:
a = np.array([1, 4, 5, 8], float)
print(type(a))
<class 'numpy.ndarray'>
All elements can be accessed and manipulated in the same way as with lists:
# print the first two elements of the array
print(a[0:2])
[1. 4.]
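Because all elements of an array share one data type, NumPy picks a common type when the input list is mixed. The following sketch shows this with illustrative variable names; the exact integer type (for example int64 or int32) depends on the platform, so only its general kind is checked.

```python
import numpy as np

integers = np.array([1, 2, 3])        # all inputs are integers
mixed = np.array([1, 2.5, 3])         # one float makes the whole array float
forced = np.array([1, 2, 3], float)   # the type can also be requested explicitly

is_integer_array = np.issubdtype(integers.dtype, np.integer)

print(integers.dtype, mixed.dtype, forced.dtype, is_integer_array)
```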
Arrays can also be multidimensional. Here is an example of a two-dimensional array (matrix):
a = np.array([[1, 2, 3], [4, 5, 6]], float)
print(a)
[[1. 2. 3.]
[4. 5. 6.]]
print(a[:,1])
[2. 5.]
print('Array shape:', a.shape)
print('Type of data in the array:', a.dtype)
Array shape: (2, 3)
Type of data in the array: float64
The shape of the array can be changed using the methods reshape, flatten, and transpose. Using reshape, you can add or remove dimensions in an array, as well as change the number of elements in each dimension, as long as the total number of elements stays the same.
a = np.array(range(10), float)
print('Old shape:', a.shape)
a = a.reshape((5, 2))
print('New shape:', a.shape)
Old shape: (10,)
New shape: (5, 2)
transpose is a method that allows you to change the order of the axes.
a = a.transpose()
print('Shape after transpose:', a.shape)
Shape after transpose: (2, 5)
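Two details of reshape are worth a quick sketch: one dimension may be given as -1, in which case NumPy infers it from the total number of elements, and a shape that does not match the number of elements raises a ValueError. The array and variable names below are illustrative.

```python
import numpy as np

values = np.arange(12)            # 12 elements: 0, 1, ..., 11

table = values.reshape(3, -1)     # -1 is inferred as 4, because 3 * 4 = 12
swapped = table.transpose()       # axes are swapped: shape (4, 3)

# A shape with the wrong total number of elements is rejected
try:
    values.reshape(5, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(table.shape, swapped.shape, reshape_failed)
```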
Task 12#
# make the array 3-dimensional with shape = (1, 2, 5)
# transpose it so that its shape = (2, 5, 1)
The flatten method converts a multidimensional array into a one-dimensional one.
a = a.flatten()
print('Shape of the flattened array:', a.shape)
Shape of the flattened array: (10,)
Arithmetic operators in arrays are applied elementwise.
a = np.array([20, 30, 40, 50])
b = np.arange(4)
print('b:', b)
print('a - b:', a - b)
b: [0 1 2 3]
a - b: [20 29 38 47]
print('b^3:', np.power(b, 3))
b^3: [ 0 1 8 27]
print('10 * sin(a):', 10 * np.sin(a))
10 * sin(a): [ 9.12945251 -9.88031624 7.4511316 -2.62374854]
print('Elements of a < 35:', a < 35)
Elements of a < 35: [ True True False False]
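The array of True and False values produced by a comparison can itself be used as an index, which selects only the elements where the condition holds. This is a short sketch of that idea; mask, selected and how_many are illustrative names.

```python
import numpy as np

a = np.array([20, 30, 40, 50])

mask = a < 35              # elementwise comparison gives a boolean array
selected = a[mask]         # keeps only the elements where mask is True
how_many = mask.sum()      # True counts as 1, so the sum counts the matches

print(selected, how_many)
```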
Let’s look at the multiplication of matrices:
A = np.array([[1, 1],
[0, 1]])
B = np.array([[2, 0],
[3, 4]])
print('elementwise product:')
print(A * B)
print('matrix product')
print(A @ B)
print('another matrix product')
print(A.dot(B))
elementwise product:
[[2 0]
[0 4]]
matrix product
[[5 4]
[3 4]]
another matrix product
[[5 4]
[3 4]]
In machine learning and data science, you will sometimes have to work with randomly generated data. NumPy's random module provides various functions for generating single random values or arrays of random values.
If we initialize the generator with a fixed initial value (np.random.seed), then the same sequence of random numbers will always be generated for that value. This means that NumPy's random generator is deterministic for a given seed.
np.random.seed(5)
Let’s consider several functions for generating random values:
np.random.normal?
np.random.normal(loc = 1, scale = 1, size = 2)
array([1.44122749, 0.66912985])
# Generate a 2-dimensional array from the uniform distribution on [0, 1)
np.random.rand?
np.random.rand(2,2)
array([[0.20671916, 0.91861091],
[0.48841119, 0.61174386]])
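The determinism of seeded generation can be checked directly. The sketch below deliberately uses separate np.random.RandomState objects instead of np.random.seed, so that it does not disturb the global generator used in the cells above; the seeds 5 and 6 and the variable names are arbitrary choices.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers
first = np.random.RandomState(5).rand(3)
second = np.random.RandomState(5).rand(3)

# A generator with a different seed produces a different sequence
other = np.random.RandomState(6).rand(3)

same_seed_matches = np.array_equal(first, second)
different_seed_matches = np.array_equal(first, other)

print(same_seed_matches, different_seed_matches)
```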
More information about using NumPy can be found in the following tutorial.