How to work with Python#

What is Colab?#

Colab, or “Colaboratory”, allows you to write and execute Python in your browser, with

  • Zero configuration required

  • Access to GPUs free of charge

  • Easy sharing

Whether you’re a student, a data scientist or an AI researcher, Colab can make your work easier.

Getting started#

The document you are reading is not a static web page, but an interactive environment called a Colab notebook that lets you write and execute code.

For example, here is a code cell with a short Python script that computes a value, stores it in a variable, and prints the result:

# run the code
seconds_in_a_day = 24 * 60 * 60
seconds_in_a_day
86400

To execute the code in the above cell, select it with a click and then either press the play button to the left of the code, or use the keyboard shortcut “Command/Ctrl+Enter”. To edit the code, just click the cell and start editing.

Variables that you define in one cell can later be used in other cells:

seconds_in_a_week = 7 * seconds_in_a_day
seconds_in_a_week
604800

New cell

You can add new cells by using the + CODE and + TEXT buttons that show when you hover between cells. These buttons are also in the toolbar above the notebook where they can be used to add a cell below the currently selected cell.

Resources#

Links work in Colab!

Working with notebooks in Colab#

Working with data#

  • Downloading the data from Drive and Google Cloud Storage

  • Visualization of the data

Python#

Python is a high-level, general-purpose programming language.

Libraries#

List of the most frequently used libraries:

  • NumPy is short for Numerical Python, a Python library used mainly for technical and scientific computing. Its array-oriented computing capabilities make it an essential tool for fields such as linear algebra, statistical analysis, and machine learning.

  • SciPy is a Python library used for scientific and technical computing. It is built on top of NumPy and adds functionality for many scientific computing tasks: optimization, integration, signal and image processing, statistics, linear algebra, splines, clustering, and much more.

  • Pandas is an open-source data manipulation library for Python. It is built on top of the NumPy library. It introduces two primary data structures: Series and DataFrame. A Series is one-dimensional labelled data, whereas a DataFrame is two-dimensional labelled data resembling a table.

  • Scikit-learn is a machine-learning library that provides tools for data mining and analysis. It includes many machine learning algorithms for tasks such as classification, regression, and clustering.

  • matplotlib is a data visualization library that allows developers to create static, animated, and interactive visualizations in Python. The graphs and plots it produces are extensively used for data visualization.

Part 1 - Basics of Python#

For applied purposes, it is important to master several aspects:

  • Standard functions

  • Variables

  • Basic structures

  • Functions

  • Usage of libraries

Standard functions#

A function is called by writing its name followed by parentheses. Inside the parentheses you pass arguments (the objects the operation is performed on) and parameters (the settings that control how it is performed).

print()#

A function for displaying content on the screen.
Pass one or more objects to print; calling print() with no arguments prints an empty line.

Its full syntax is:

print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

  • objects: one or more objects to output, separated by commas

  • sep: separator between several objects; a space by default

  • end: what is printed after the last object; a newline character '\n' by default

print(5)
5
print(5, 6, 7, 8)
5 6 7 8
print(5, 6, 7, 8, sep='_')
5_6_7_8
print('October')
October
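The end and file parameters are not exercised in the cell above, so here is a minimal sketch of both. It is not part of the original notebook: the io.StringIO buffer and the variable names buffer and date_text are illustrative assumptions, chosen only to show that file can redirect the output somewhere other than the screen.

```python
import io

# end='' suppresses the newline, so the next print continues on the same line
print('Loading', end='')
print('...')

# file= sends the text to another destination;
# an in-memory text buffer is used here purely for illustration
buffer = io.StringIO()
print(2024, 10, 31, sep='-', end='', file=buffer)
date_text = buffer.getvalue()
print(date_text)
```

The first two calls together produce a single line, Loading..., and the last call prints the text that was collected in the buffer.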

Task 1#

Print the phrase: “Hello, world!”

# Write your code here

Arithmetic operations#

  • Addition: +

  • Subtraction: -

  • Multiplication: *

  • Division: /

  • Exponentiation: **

  • Integer division: //

  • The remainder of the division: %

print(5 + 2)
print(5 - 2)
print(5 * 2)
print(5 / 2)
print(5 // 2)
print(5 % 2)
print(5 ** 2)
7
3
10
2.5
2
1
25
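Integer division and the remainder are two halves of the same operation, which the following sketch tries to make visible. The numbers and variable names are arbitrary choices for illustration.

```python
a = 17
b = 5

quotient = a // b      # how many whole times b fits into a
remainder = a % b      # what is left over
print(quotient, remainder)

# The two results always rebuild the original number
rebuilt = quotient * b + remainder
print(rebuilt == a)

# With a negative number, // rounds down (towards minus infinity),
# so the result is -4 rather than -3
negative_quotient = -17 // 5
negative_remainder = -17 % 5
print(negative_quotient, negative_remainder)
```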

Variables and types of variables#

The results of operations and manipulations sometimes need to be saved, so we define variables - named objects to which we assign values.

savings_2017 = 2000
savings_2016 = 1800
increase = (savings_2017 - savings_2016) / savings_2016 * 100
# in a notebook, a trailing ? shows the documentation for a function
round?
print(round(increase), '%', sep='')
11%

The type() function returns the type of a variable or value.

print(type(savings_2017))
print(type(increase))
<class 'int'>
<class 'float'>
text = 'Monetary Savings in 2017'
print(type(text))
<class 'str'>
x = False
print(type(x))
<class 'bool'>

If you need to convert a value to another type, the following functions are useful:

  • int() - returns an integer

  • float() - returns a floating-point number

  • bool() - returns a Boolean value

  • str() - returns a string

print(savings_2017, type(savings_2017))
savings_2017_new = str(savings_2017)
print(savings_2017_new, type(savings_2017_new))
print('Savings ' +  'in 2017: ' + savings_2017_new)
2000 <class 'int'>
2000 <class 'str'>
Savings in 2017: 2000
print(int(savings_2017_new) - savings_2016)
200
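A few more conversions, as a small sketch of how the four functions behave on typical inputs. The values and variable names are made up for illustration.

```python
whole = int('42')          # string -> integer
truncated = int(2.9)       # the fractional part is dropped, not rounded
price = float('3.5')       # string -> floating-point number
label = str(3.5)           # number -> string
empty_is_false = bool('')  # an empty string converts to False
nonzero_is_true = bool(7)  # any non-zero number converts to True

print(whole, truncated, price, label, empty_is_false, nonzero_is_true)
```

Note that int() cuts off the fractional part, so int(2.9) gives 2; use round() when rounding is what you need.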

Task 2#

Correct the line of code so that it works:

# print("I started with $" + savings_2016 + " and now have $" + savings_2017_new + ". Not good!")

Lists#

Each of the four data types covered so far stores a single value in a variable:

  • int: an integer

  • float: a number with a fractional part

  • str: text

  • bool: a boolean variable

For example, if we want to record the savings of family members, we can create several variables and store the information there:

savings1 = 500
savings2 = 1000
savings3 = 746
savings4 = 2000

But this is not very convenient. Instead, we can store all the information in one variable:

savings_family = [500, 1000, 746, 2000]

This construction is called a list and is a separate data type. The elements of the list can be accessed by the index:

print(savings_family[2])
746
print(savings_family[-1])
2000
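Indexing starts at 0, which is why index 2 returned the third element above. The sketch below shows this together with slices, which select several elements at once. The list weekly_steps and its values are hypothetical.

```python
weekly_steps = [4000, 7500, 6200, 9100, 3000]  # hypothetical data

first = weekly_steps[0]          # indexing starts at 0
last = weekly_steps[-1]          # negative indices count from the end
first_three = weekly_steps[:3]   # elements with indices 0, 1, 2
middle = weekly_steps[1:4]       # start is included, stop is excluded
count = len(weekly_steps)        # number of elements in the list

print(first, last, first_three, middle, count)
```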

A list can itself contain lists :)

savings_family = [['mom', 500], ['dad', 1000], ['brother', 746], ['me', 2000]]
s = 0
# Loop that sums up all the savings
for member in savings_family:
    s = s + member[1]
print(s)
4246
print(savings_family)
[['mom', 500], ['dad', 1000], ['brother', 746], ['me', 2000]]

Task 3#

Create a list called home that contains the name and area of each room in the house (kitchen - 10, bedroom - 12, bathroom - 5). Print the list.

There are many methods (i.e. functions that belong to lists) for working with lists:

  • list.append(x) adds an item to the end of the list.

  • list.extend(L) extends the list by adding all the elements of the list L to the end.

  • list.insert(i, x) inserts the value x before the element at index i.

  • list.remove(x) deletes the first item in the list that has the value x. Raises ValueError if no such item exists.

  • list.pop([i]) deletes the ith element and returns it. If the index is not specified, the last element is deleted.

  • list.index(x, [start [, end]]) returns the position of the first element with the value x (the search is conducted from start to end).

  • list.count(x) returns the number of items with the value x.

  • list.sort([key=function]) sorts the list in place, optionally comparing items by the result of the key function.

  • list.reverse() reverses the list in place.

  • list.copy() returns a shallow copy of the list.

  • list.clear() removes all items from the list.
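The methods listed above can be tried on a small example. This is only a sketch with made-up lists named numbers and words; the comments show the state of the list after each call.

```python
numbers = [3, 1, 2]

numbers.append(5)              # [3, 1, 2, 5]
numbers.insert(1, 10)          # [3, 10, 1, 2, 5]
numbers.remove(1)              # [3, 10, 2, 5]
last = numbers.pop()           # returns 5, list is now [3, 10, 2]
position = numbers.index(10)   # 10 is stored at index 1
numbers.sort()                 # [2, 3, 10]
backup = numbers.copy()        # independent copy of the sorted list
numbers.reverse()              # [10, 3, 2]; backup is unchanged

# key=len sorts the words by their length instead of alphabetically
words = ['pear', 'fig', 'banana']
words.sort(key=len)

print(numbers, backup, last, position, words)
```

Most of these methods change the list in place and return None, so write numbers.sort() rather than numbers = numbers.sort().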

# deleting the entry about the last family member
savings_family.pop()
print(savings_family)
[['mom', 500], ['dad', 1000], ['brother', 746]]
# and now let's add new one
savings_family.append(['me', 2000])
# We create a list of two lists, which stores information about the savings of an uncle, 
# who has 4,000 rubles, and the savings of an aunt, who has 10,000 rubles
savings_relatives = [['uncle', 4000], ['aunt', 10000]]

Task 4#

Add the list of relatives' savings to the list of family savings. Print the resulting list.

Task 5#

Compute the total of all savings.

for member in savings_family:
    # your code here
    print(member)
['mom', 500]
['dad', 1000]
['brother', 746]
['me', 2000]

Basic language constructions#

Logical expressions#

By analogy with arithmetic expressions, there are logical expressions that can be true or false. A simple logical expression has the form

<arithmetic expression> <comparison sign> <arithmetic expression>.

For example, if we have variables x and y with some values, then the logical expression

x + y < 3 * y

has x + y as the first arithmetic expression, < (less than) as the comparison sign, and 3 * y as the second arithmetic expression.

Operator   Meaning
<          Less than
>          Greater than
<=         Less than or equal to
>=         Greater than or equal to
==         Equal to
!=         Not equal to

x = 1 > 2
print(type(x))
print(x)
print(int(x))
<class 'bool'>
False
0

To write a complex logical expression, it is often necessary to use the logical connectives “and”, “or” and “not”. In Python, they are denoted as and, or and not, respectively. The and and or operations are binary, i.e. they must be written between operands, for example x < 3 or y > 2. The operation not is unary and must be written before one operand.
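The three connectives can be seen side by side in the following sketch. The variables age and has_ticket, and the rule they encode, are invented for illustration.

```python
age = 20
has_ticket = False

# and is True only when both sides are True
can_enter = age >= 18 and has_ticket

# or is True when at least one side is True; not flips a Boolean value
needs_help = age < 18 or not has_ticket

print(can_enter)
print(needs_help)
```

Here can_enter is False because the second operand is False, while needs_help is True because not has_ticket is True.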

Task 6#

# Set three numbers x, y, z and try different complex logical expressions:

# print(x == y)
# print(x > y and y < z)

Task 7#

Compare whether x is greater than 10. Print the result.

x = 7

Task 8#

Do mom’s savings fall into the range from 300 to 500? Write a boolean expression that checks this condition.

savings_mom = 500

Conditional operator#

Logical expressions are most often used in conditional statements.

Syntax in Python:
if condition1:
  commands
elif condition2:
  commands
elif condition3:
  commands
else:
  commands

x = 5
y = 7
if x > y:
    print('X is greater than Y')
    print(x - y)
else:
    print(' ')
    print('X is not greater than Y')
 
X is not greater than Y
# Uncomment next two lines and run the cell
# x = int(input('Enter x: '))
# y = int(input('Enter y: '))
if x > y:
    print('X is greater than Y')
    print(x - y)
elif y > x:
    print('Y is greater than X')
    print(y - x)
else:
    print('X equals Y')
Y is greater than X
2
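When several elif branches are present, Python checks the conditions from top to bottom and runs only the first branch whose condition is true. A minimal sketch, with a hypothetical savings amount and made-up thresholds:

```python
savings = 1200  # hypothetical amount

if savings >= 2000:
    level = 'high'
elif savings >= 1000:
    level = 'medium'    # first true condition, so this branch runs
elif savings >= 500:
    level = 'low'       # also true, but never reached
else:
    level = 'very low'

print(level)
```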

Task 9#

The kitchen area is given. If it exceeds 15 square meters, print “Kitchen is big!”; otherwise, print “Kitchen is small”.

room = "kit"
area = 14.0

# if ...:  # your condition here
#     print()
# else:
#     print()

The while loop#

As soon as a line of code is executed, it is “forgotten”, i.e. to repeat the action you need to write it out again. Loops are used to repeat the same actions many times.

while executes a block of commands as long as the condition is true. After the block finishes, control returns to the line with the condition: if it is still True, the block runs again; if it is False, execution continues with the commands written after the loop.

It is important not to create an infinite loop by accident: something inside the block must eventually make the condition False.

while condition1:
  commands

All commands are indented.

temperature = 5
while temperature > 0:
    print(f'Temperature is {temperature}. The weather is OK.')
    temperature = temperature - 1
Temperature is 5. The weather is OK.
Temperature is 4. The weather is OK.
Temperature is 3. The weather is OK.
Temperature is 2. The weather is OK.
Temperature is 1. The weather is OK.
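A while loop is a natural fit when the number of repetitions is not known in advance. The sketch below counts how many years it takes for a balance to reach a target. The starting balance, the target, and the 10% yearly growth are assumptions made up for this example.

```python
balance = 1000.0   # hypothetical starting savings
target = 2000.0    # hypothetical goal
years = 0

while balance < target:
    balance = balance * 1.10   # assumed growth of 10% per year
    years = years + 1          # this update is what ends the loop eventually

print(years)
```

The loop stops after 8 passes, because the balance first exceeds 2000 in the eighth year.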

Task 10#

For a given positive integer N, print the squares of all integers from N down to 1 (including N), in descending order.

Sample

Input data:
3

Program output:
9
4
1

N = 5
# Your code

For loop#

The range(n) function#

range() creates an object of the range class that represents an arithmetic progression from start (included) to stop (excluded) with a given step. If start is omitted it is 0, and if step is omitted it is 1.

Its syntax is:

  • range(start, stop[, step])

print(range(5))
print(list(range(5)))
print(list(range(2, 7)))
print(list(range(2, 6, 2)))
range(0, 5)
[0, 1, 2, 3, 4]
[2, 3, 4, 5, 6]
[2, 4]
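Two details of range are easy to miss: the step may be negative, and a range can be empty. A short sketch with arbitrary numbers:

```python
# A negative step counts down; stop (here 0) is still excluded
countdown = list(range(10, 0, -3))

# If start is already past stop, the range is simply empty
empty = list(range(5, 2))

# len() works on a range without converting it to a list
size = len(range(2, 7))

print(countdown, empty, size)
```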

The for loop lets you iterate over the elements of any iterable object, such as a list or a range.

Its syntax is:
for i in iterable:
  commands

All commands are indented.

n = 5
for i in range(n + 1):
    print(i**2)
0
1
4
9
16
25
for color in ['red', 'green', 'yellow']:
    print(color, 'apple')
red apple
green apple
yellow apple
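When the elements are themselves short lists, as in the list of lists used earlier, the loop can unpack each element into separate variables. A sketch with a made-up prices list:

```python
prices = [['apple', 3], ['pear', 5], ['plum', 2]]  # hypothetical data

names = []
total = 0
# Each inner list has two items, so it can be unpacked into two variables
for name, price in prices:
    names.append(name)
    total = total + price

print(names)
print(total)
```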

Writing functions#

If some operations need to be repeated from time to time and there are no ready-made functions, then you can create your own.

Syntax:
def function_name(list of arguments):
  commands
  return result of the function

All commands are indented.

The function must be defined before it is first called.

def power(number, p):
    result = number**p
    return result

# Uncomment and try to run the cell
# number = int(input("Number: "))
# p = int(input("Power: "))
# print(power(number, p))
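A function can also give some of its arguments default values, so that they may be omitted in the call. The function percent_change below is a hypothetical example written for this sketch; it repeats the savings calculation from the section on variables.

```python
def percent_change(old, new, digits=1):
    """Return the change from old to new in percent, rounded to digits places."""
    change = (new - old) / old * 100
    return round(change, digits)

# digits is omitted, so the default value 1 is used
growth = percent_change(1800, 2000)

# arguments can also be passed by name
growth_whole = percent_change(100, 150, digits=0)

print(growth, growth_whole)
```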

Task 11#

Write a function that returns the perimeter of a rectangle using the length of the two sides.

# def rectangle(a, b):
    # Your code

a = 4
b = 10
# print(rectangle(a, b))

Working with NumPy#

NumPy is a Python module that provides common mathematical and numerical operations as fast functions. Its functionality is comparable to that of MATLAB. NumPy (Numerical Python) provides basic methods for manipulating large arrays and matrices. SciPy (Scientific Python) extends NumPy with a large collection of useful algorithms such as minimization, Fourier transforms, regression, and other applied mathematical techniques.

To use any library, you first need to import it into the notebook:

import library_name

The name of the library can be shortened for later use by giving an alias in the import:

import library_name as alias

import numpy as np

The main feature of NumPy is the array object. Arrays are similar to lists in Python, except that all elements of an array must have the same data type, such as float or int. With arrays, numerical operations on large amounts of data can be performed many times faster and, most importantly, much more efficiently than with lists.
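The difference between a list and an array shows up as soon as arithmetic is applied. The following sketch uses made-up variable names and assumes NumPy is installed, as it is in Colab.

```python
import numpy as np

values_list = [1, 2, 3]
values_array = np.array(values_list)

doubled_list = values_list * 2     # repeats the list
doubled_array = values_array * 2   # multiplies every element by 2

# Mixing integers and floats gives an array with a single common type
mixed = np.array([1, 2.5])

print(doubled_list)
print(doubled_array)
print(mixed.dtype)
```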

Creating an array from a list:

a = np.array([1, 4, 5, 8], float)
print(type(a))
<class 'numpy.ndarray'>

All elements can be accessed and manipulated in the same way as with lists:

# try to print element of array
print(a[0:2])
[1. 4.]

Arrays can also be multidimensional. Here is an example of a two-dimensional array (matrix):

a = np.array([[1, 2, 3], [4, 5, 6]], float)
print(a)
[[1. 2. 3.]
 [4. 5. 6.]]
print(a[:,1])
[2. 5.]
print('Array shape:', a.shape)
print('Type of data in the array:', a.dtype)
Array shape: (2, 3)
Type of data in the array: float64

The shape of the array can be changed using the methods reshape, flatten, transpose. Using reshape, you can add or remove dimensions in an array, as well as change the number of elements in each dimension.

a = np.array(range(10), float)
print('Old shape:', a.shape)
a = a.reshape((5, 2))
print('New shape:', a.shape)
Old shape: (10,)
New shape: (5, 2)
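Reshaping never changes the number of elements, so the new shape has to account for all of them. As a sketch, one dimension can be given as -1 to let NumPy work it out, and a shape that does not fit raises an error. The array and variable names are illustrative.

```python
import numpy as np

values = np.arange(12)            # 12 elements: 0, 1, ..., 11

# -1 asks NumPy to infer this dimension: 12 elements / 3 rows = 4 columns
table = values.reshape(3, -1)
first_row = table[0]
print(table.shape)

# 5 * 3 = 15 does not match 12 elements, so reshape raises ValueError
reshape_failed = False
try:
    values.reshape(5, 3)
except ValueError:
    reshape_failed = True
print(reshape_failed)
```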

transpose is a function that allows you to change the order of the axes.

a = a.transpose()
print('Shape after transpose:', a.shape)
Shape after transpose: (2, 5)

Task 12#

# make the array 3-dimensional with shape = (1, 2, 5)

# transpose it so that its shape = (2, 5, 1)

The flatten function allows you to convert an array from a multidimensional array to a one-dimensional one.

a = a.flatten()
print('Shape of the flattened array:', a.shape)
Shape of the flattened array: (10,)

Arithmetic operators in arrays are applied elementwise.

a = np.array([20, 30, 40, 50])
b = np.arange(4)
print('b:', b)
print('a - b:', a - b)
b: [0 1 2 3]
a - b: [20 29 38 47]
print('b^3:', np.power(b, 3))
b^3: [ 0  1  8 27]
print('10 * sin(a):', 10 * np.sin(a))
10 * sin(a): [ 9.12945251 -9.88031624  7.4511316  -2.62374854]
print('Elements of a < 35:', a < 35)
Elements of a < 35: [ True  True False False]

Let’s look at the multiplication of matrices:

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])
print('elementwise product:')
print(A * B)

print('matrix product')
print(A @ B)

print('another matrix product')
print(A.dot(B))
elementwise product:
[[2 0]
 [0 4]]
matrix product
[[5 4]
 [3 4]]
another matrix product
[[5 4]
 [3 4]]
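The matrix product also works for non-square matrices, as long as the number of columns of the first matrix equals the number of rows of the second. A sketch with two made-up matrices named left and right, including one entry computed by hand for comparison:

```python
import numpy as np

left = np.array([[1, 2, 3],
                 [4, 5, 6]])     # shape (2, 3)
right = np.array([[1, 0],
                  [0, 1],
                  [1, 1]])       # shape (3, 2)

product = left @ right           # result has shape (2, 2)

# Entry [0, 0] is the first row of left times the first column of right
top_left = 1 * 1 + 2 * 0 + 3 * 1

print(product)
print(top_left == product[0, 0])
```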

In machine learning and data science, sometimes you will have to work with randomly generated data. Various numpy random methods exist to generate random values or a numpy array with random values.

If we seed the random number generator with a fixed value (np.random.seed), the same sequence of random numbers is generated every time for that seed. In other words, NumPy's random numbers are deterministic for a given seed.

np.random.seed(5)

Let’s consider several functions for generating random values:

np.random.normal?
np.random.normal(loc = 1, scale = 1, size = 2)
array([1.44122749, 0.66912985])
# Generate a 2-dimensional array from a uniform distribution on [0, 1)
np.random.rand?
np.random.rand(2,2)
array([[0.20671916, 0.91861091],
       [0.48841119, 0.61174386]])
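The effect of the seed described above can be checked directly: resetting the same seed reproduces the same numbers. This is a minimal sketch; the variable names are illustrative and the seed value 5 is taken from the cell above.

```python
import numpy as np

np.random.seed(5)
first_draw = np.random.rand(3)

np.random.seed(5)                 # reset the generator to the same state
second_draw = np.random.rand(3)

# Both draws contain exactly the same three numbers
same = np.array_equal(first_draw, second_draw)
print(same)
```

Without the second call to np.random.seed, the second draw would continue the sequence and give different numbers.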

More information about using NumPy can be found in the following tutorial.