Basic Python Datatypes
Python is a dynamically typed language – this means that you don’t need to specify ahead of time what kind of data you are going to store in a variable. Nevertheless, there are some core datatypes that we need to become familiar with as we use the language.
The first set of datatypes are similar to those found in other languages (like C/C++ and Fortran): floating point numbers, integers, and strings.
Floating point is essential for computational science. A great introduction to floating point and its limitations is: What every computer scientist should know about floating-point arithmetic by D. Goldberg.
The next set of datatypes are containers. In python, unlike some languages, these are built into the language and make it very easy to do complex operations. We’ll look at these later.
Some examples come from the python tutorial: http://docs.python.org/3/tutorial/
integers
Integers are numbers without a decimal point. They can be positive or negative. Most programming languages use a fixed amount of memory to store a single integer, but python will expand the amount of memory as necessary to store large integers.
The basic operators, +, -, *, and /, work with integers:
2+2+3
7
2*-4
-8
Note: integer division is one place where python 2 and python 3 differ.
In python 3.x, dividing 2 integers results in a float. In python 2.x, dividing 2 integers results in an integer. The latter is consistent with many statically-typed programming languages (like Fortran or C), since the datatype of the result is the same as the inputs, but the former is more in line with our expectations.
1/2
0.5
To get an integer result, we can use the // operator.
1//2
0
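As a small sketch of how // behaves beyond this simple case (the variable names here are illustrative, not from the original notes): floor division rounds toward negative infinity, which matters for negative operands, and the % operator gives the matching remainder.

```python
# Floor division rounds toward negative infinity, not toward zero.
q_pos = 7 // 2       # 3
q_neg = -7 // 2      # -4, the floor of -3.5

# The remainder operator % pairs with //, so that
# (a // b) * b + (a % b) == a always holds for integers.
r_pos = 7 % 2        # 1
r_neg = -7 % 2       # 1, because -4 * 2 + 1 == -7

# divmod() returns the quotient and remainder together.
q, r = divmod(7, 2)

# int() applied to a float truncates toward zero instead.
t_neg = int(-7 / 2)  # -3

print(q_pos, q_neg, r_pos, r_neg, q, r, t_neg)
```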
Python is a dynamically-typed language—this means that we do not need to declare the datatype of a variable before initializing it.
Here we’ll create a variable (think of it as a descriptive label that can refer to some piece of data). The = operator assigns a value to a variable.
a = 1
b = 2
Functions operate on variables and return a result. Here, print() will output to the screen.
a + b
3
a * b
2
Note that variable names are case sensitive, so a and A are different
A = 2048
print(a, A)
1 2048
Here we initialize 3 variables, all to 0, but these are still distinct variables, so we can change one without affecting the others.
x = y = z = 0
print(x, y, z)
0 0 0
z = 1
z
1
Python has some built-in help (and Jupyter/IPython has even more).
Try doing:
help(x)
alternatively, try:
x?
(this only works in Jupyter)
Another function, type(), returns the data type of a variable:
type(x)
int
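To connect type() back to dynamic typing, here is a minimal sketch (the variable names are made up for illustration) showing that the same name can be rebound to values of different types, and that type() reports whatever the name currently refers to.

```python
# The same name can refer to data of different types over time.
x = 1
first_type = type(x).__name__    # 'int'

x = 1.5
second_type = type(x).__name__   # 'float'

x = "one"
third_type = type(x).__name__    # 'str'

# isinstance() is a common way to test for a particular type.
is_int = isinstance(2, int)      # True
is_float = isinstance(2, float)  # False

print(first_type, second_type, third_type, is_int, is_float)
```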
Note in languages like Fortran and C, you specify the amount of memory an integer can take (usually 2 or 4 bytes). This puts a restriction on the largest size integer that can be represented. Python will adapt the size of the integer so you don’t overflow
a = 12345678901234567890123456789012345123456789012345678901234567890
print(a)
print(a.bit_length())
print(type(a))
12345678901234567890123456789012345123456789012345678901234567890
213
<class 'int'>
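For comparison with fixed-width integers, the following sketch steps past the largest signed 64-bit value, which would overflow in a language with a fixed 8-byte integer. The names max_int64 and bigger are just illustrative.

```python
# Largest value a signed 64-bit integer can hold.
max_int64 = 2**63 - 1

# In python, adding 1 simply produces a larger integer.
bigger = max_int64 + 1

print(max_int64.bit_length())  # 63
print(bigger.bit_length())     # 64

# Even much larger powers stay exact.
big = 2**100
print(big.bit_length())        # 101
print(type(big))
```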
floating point
when operating with both floating point and integers, the result is promoted to a float.
1. + 2
3.0
but note the special integer division operator
1.//2
0.0
It is important to understand that since there are infinitely many real numbers between any two bounds, on a computer we have to approximate this by a finite number. There is an IEEE standard for floating point that pretty much all languages and processors follow.
This means two things:
not every real number will have an exact representation in floating point
there is a finite precision to numbers – below this we lose track of differences (this is usually called roundoff error)
On our course website, I posted a link to a paper, What every computer scientist should know about floating-point arithmetic – this is a great reference on understanding how a computer stores numbers.
Consider the following expression, for example:
0.3/0.1 - 3
-4.440892098500626e-16
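One practical consequence is that testing floats for exact equality is fragile. A common approach, sketched here with the standard library's math.isclose() (the variable names are illustrative), is to compare within a tolerance instead.

```python
import math

x = 0.3 / 0.1

# Exact comparison fails because of roundoff.
exact_match = (x == 3)

# math.isclose() compares within a relative tolerance
# (the default is rel_tol=1e-09).
close_match = math.isclose(x, 3)

# The same issue shows up with simple sums.
sum_exact = (0.1 + 0.2 == 0.3)
sum_close = math.isclose(0.1 + 0.2, 0.3)

print(exact_match, close_match, sum_exact, sum_close)
```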
Here’s another example: The number 0.1 cannot be exactly represented on a computer. In our print, we use a format specifier (the stuff inside of the {}) to ask for more precision to be shown:
a = 0.1
print("{:30.20}".format(a))
0.10000000000000000555
we can ask python to report the limits on floating point
import sys
sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
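Note that this says that the largest finite float is 1.7976931348623157e+308 and the smallest positive normalized float is 2.2250738585072014e-308 (negative numbers have the same range of magnitudes, and zero is represented exactly).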
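A brief sketch of what happens at the edges of this range, assuming the usual IEEE double-precision floats that sys.float_info reports above (the variable names are illustrative): going past max gives infinity, and going far enough below min eventually gives zero.

```python
import math
import sys

largest = sys.float_info.max
tiny = sys.float_info.min

# Multiplying past the largest float overflows to infinity.
overflowed = largest * 10

# Just below min, python can still store reduced-precision
# ("subnormal") values, so this is small but not zero.
subnormal = tiny / 2

# Far enough below min, the result underflows to zero.
underflowed = tiny / 1e20

print(overflowed, subnormal, underflowed)
print(math.isinf(overflowed))
```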
We also see that the precision is 2.220446049250313e-16 (this is commonly called machine epsilon). To see this, consider adding a small number to 1.0. We’ll use the equality operator (==) to test if two numbers are equal:
Quick Exercise
Define two variables, \(a = 1\), and \(e = 10^{-16}\).
Now define a third variable, b = a + e
We can use the python == operator to test for equality. What do you expect b == a to return? Run it and see if it agrees with your guess.
modules
The core python language is extended by a standard library that provides additional functionality. These added pieces are in the form of modules that we can import into our python session (or program).
The math module provides functions that do the basic mathematical operations as well as provide constants (note there is a separate cmath module for complex numbers).
In python, you import a module. The functions are then defined in a separate namespace: this is a separate region that defines names and variables, etc. A variable in one namespace can have the same name as a variable in a different namespace, and they don’t clash. You use the “.” operator to access a member of a namespace.
By default, when you type stuff into the python interpreter or here in the Jupyter notebook, or in a script, it is in its own default namespace, and you don’t need to prefix any of the variables with a namespace indicator.
import math
math provides the value of pi
math.pi
3.141592653589793
This is distinct from any variable pi we might define here
pi = 3
print(pi, math.pi)
3 3.141592653589793
Note here that pi and math.pi are distinct from one another: they are in different namespaces.
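The same idea applies to functions. As a hedged illustration, both math and cmath define a function named sqrt, and the namespace prefix selects which one is called. The alias real_sqrt is a made-up name for this sketch.

```python
import cmath
import math

# An import can also bring a single name into the current
# namespace, optionally under a different name.
from math import sqrt as real_sqrt

# Same function name, two namespaces, two behaviors.
r = math.sqrt(4.0)    # 2.0
c = cmath.sqrt(-4.0)  # 2j

# The aliased name needs no prefix.
a = real_sqrt(9.0)    # 3.0

print(r, c, a)
```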
floating point operations
The same operators, +, -, *, /, work as usual for floating point numbers. To raise a number to a power, we use the ** operator (this is the same as Fortran)
R = 2.0
math.pi * R**2
12.566370614359172
operator precedence follows that of most languages. See
https://docs.python.org/3/reference/expressions.html#operator-precedence
in order of precedence:
quantities in ()
slicing, calls, subscripts
exponentiation (**)
+x, -x, ~x
*, @, /, //, %
+, -
(after this are bitwise operations and comparisons)
Parentheses can be used to override the precedence.
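A few illustrative cases of these rules (a sketch with made-up variable names), including the perhaps surprising interaction between unary minus and exponentiation:

```python
# ** binds more tightly than unary minus, so this is -(2**2).
neg_sq = -2**2          # -4
paren_sq = (-2)**2      # 4

# * binds more tightly than +.
mixed = 2 + 3 * 4       # 14
grouped = (2 + 3) * 4   # 20

# Operators at the same level (here /) evaluate left to right.
div_chain = 8 / 4 / 2   # 1.0

print(neg_sq, paren_sq, mixed, grouped, div_chain)
```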
Quick Exercise
Consider the following expressions. Using the ideas of precedence, think about what value will result, then try it out in the cell below to see if you were right.
1 + 3*2**2
1 + (3*2)**2
2**3**2
The math module provides a lot of the standard math functions we might want to use.
For the trig functions, the expectation is that the argument to the function is in radians; you can use math.radians() to convert from degrees to radians, e.g.:
math.cos(math.radians(45))
0.7071067811865476
Notice that in that statement we are feeding the output of one function (math.radians()) into a second function, math.cos().
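A short sketch of a few other math functions used in the same nested style. The particular functions chosen here (math.hypot, math.atan2, math.degrees, math.sqrt) are just a sample, and the variable names are illustrative.

```python
import math

# sin() also expects radians.
s = math.sin(math.radians(30))

# hypot() returns sqrt(x**2 + y**2).
hyp = math.hypot(3.0, 4.0)

# atan2() returns an angle in radians; degrees() converts it back.
angle_deg = math.degrees(math.atan2(1.0, 1.0))

root2 = math.sqrt(2.0)

print(s, hyp, angle_deg, root2)
```

Because of roundoff, results such as s are best compared to the expected value (0.5 here) with math.isclose() rather than ==.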
When in doubt, ask for help to discover all of the things a module provides:
help(math.sin)
Help on built-in function sin in module math:
sin(x, /)
Return the sine of x (measured in radians).
complex numbers
python uses ‘j’ to denote the imaginary unit
1.0 + 2j
(1+2j)
a = 1j
b = 3.0 + 2.0j
print(a + b)
print(a * b)
(3+3j)
(-2+3j)
we can use abs() to get the magnitude and separately get the real or imaginary parts
print("magnitude: ", abs(b))
print("real part: ", a.real)
print("imag part: ", a.imag)
magnitude: 3.605551275463989
real part: 0.0
imag part: 1.0
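As a further sketch using the cmath module mentioned earlier (the variable names are illustrative), complex numbers also have a conjugate() method, and cmath provides complex-aware versions of functions like sqrt:

```python
import cmath

z = 3.0 + 4.0j

mag = abs(z)               # 5.0
conj = z.conjugate()       # (3-4j)
prod = z * conj            # (25+0j), the squared magnitude

# cmath.phase() returns the angle of z in radians.
phase = cmath.phase(z)

# cmath.sqrt() accepts negative arguments; math.sqrt() would
# raise a ValueError for the same input.
root = cmath.sqrt(-1)      # 1j

print(mag, conj, prod, phase, root)
```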
strings
python doesn’t care if you use single or double quotes for strings:
a = "this is my string"
b = 'another string'
print(a)
print(b)
this is my string
another string
Many of the usual mathematical operators are defined for strings as well. For example to concatenate or duplicate:
a + b
'this is my stringanother string'
a + ". " + b
'this is my string. another string'
a * 2
'this is my stringthis is my string'
There are several escape codes that are interpreted in strings. These start with a backslash, \. E.g., you can use \n for a new line:
a = a + "\n"
print(a)
this is my string
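A few other common escape codes, sketched with made-up example strings. Note that each escape sequence counts as a single character in the string.

```python
tabbed = "x\ty"                 # \t is a tab
two_lines = "first\nsecond"     # \n is a newline
backslash = "C:\\temp"          # \\ is a literal backslash
quoted = "she said \"hi\""      # \" is a literal double quote

print(tabbed)
print(two_lines)
print(backslash)
print(quoted)

# Each escape sequence is one character.
print(len(tabbed), len(backslash))
```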
Quick Exercise
The input() function can be used to ask the user for input.
Use help(input) to see how it works.
Write code to ask for input and store the result in a variable. input() will return a string.
Use the float() function to convert a number entered as input to a floating point variable.
Check to see if the conversion worked using the type() function.
""" can enclose multiline strings. This is useful for docstrings at the start of functions (more on that later…)
c = """
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt
in culpa qui officia deserunt mollit anim id est laborum."""
print(c)
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt
in culpa qui officia deserunt mollit anim id est laborum.
A raw string does not replace escape sequences (like \n). Just put an r before the first quote:
d = r"this is a raw string\n"
d
'this is a raw string\\n'
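To make the difference concrete, this small sketch (with illustrative variable names) compares a normal and a raw string built from the same characters in the source:

```python
normal = "a\nb"   # \n is interpreted as one newline character
raw = r"a\nb"     # backslash and n are kept as two characters

print(len(normal))  # 3
print(len(raw))     # 4

# A raw string is equivalent to escaping each backslash by hand.
same = (raw == "a\\nb")
print(same)
```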
slicing is used to access a portion of a string.
slicing a string can seem a bit counterintuitive if you are coming from Fortran. The trick is to think of the index as representing the left edge of a character in the string. When we do arrays later, the same will apply.
Also note that python (like C) uses 0-based indexing
Negative indices count from the right.
a = "this is my string"
print(a)
print(a[5:7])
print(a[0])
print(d)
print(d[-2])
this is my string
is
t
this is a raw string\n
\
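Some further slicing patterns, as a sketch on the same example text (the variable s and the result names are illustrative). Omitting the start or end of a slice means "from the beginning" or "to the end", and a third number gives a step.

```python
s = "this is my string"

first_word = s[:4]     # 'this'
middle = s[8:10]       # 'my'
last_word = s[-6:]     # 'string'

# The length of s[i:j] is j - i, which follows from thinking of
# indices as the left edges of characters.
length = len(s[5:7])   # 2

# A negative step walks through the string backwards.
reversed_s = s[::-1]

print(first_word, middle, last_word, length)
print(reversed_s)
```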
Quick Exercise:
Strings have a lot of methods (functions that know how to work with a particular datatype, in this case strings). A useful method is .find(). For a string a, a.find(s) will return the index of the first occurrence of s.
For our string c above, find the first . (identifying the first full sentence), and print out just the first sentence in c using this result.
there are also a number of methods and functions that work with strings. Here are some examples:
print(a.replace("this", "that"))
print(len(a))
print(a.strip()) # strip() removes leading and trailing whitespace, including any \n
print(a.strip()[-1])
that is my string
17
this is my string
g
Note that our original string, a, has not changed. In python, strings are immutable. Operations on strings return a new string.
a
'this is my string'
type(a)
str
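Immutability also means we cannot assign to a character in place. The sketch below (variable names are illustrative) catches the resulting error with try/except, a construct not covered in this section, and then builds a new string instead.

```python
s = "this is my string"

# Assigning to an index raises a TypeError for strings.
try:
    s[0] = "T"
except TypeError as err:
    error_message = str(err)

print(error_message)

# Instead, build a new string from pieces of the old one.
capitalized = "T" + s[1:]
upper = s.upper()

print(capitalized)
print(upper)
print(s)  # the original is unchanged
```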
As usual, ask for help to learn more:
#help(str)
We can format strings when we are printing to insert quantities in particular places in the string. A {} serves as a placeholder for a quantity and is replaced using the .format() method:
a = 1
b = 2.0
c = "test"
print("a = {}; b = {}; c = {}".format(a, b, c))
a = 1; b = 2.0; c = test
But the more modern way to do this is to use f-strings
print(f"a = {a}; b = {b}; c = {c}")
a = 1; b = 2.0; c = test
Note the f preceding the starting ".
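f-strings accept the same kind of format specifiers after a colon that we used with .format() earlier. A minimal sketch, with illustrative variable names:

```python
import math

x = math.pi
name = "pi"

fixed = f"{x:.3f}"               # 3 digits after the decimal point
sci = f"{x:.2e}"                 # scientific notation
padded = f"{42:5d}"              # integer padded to width 5
labeled = f"{name} = {x:8.4f}"   # width 8, 4 digits after the point

print(fixed)
print(sci)
print(padded)
print(labeled)
```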