Hello, and welcome to Lesson 4 of my tutorial series, “Data Science with Keshav”. For an overview of what this series is about, you can check out my other post, Data Science 101. In this part of the series, we will skip the theory and jump right into programming.
You might go through the syllabus and wonder why I skipped some preliminary sections on Calculus and Linear Algebra. It’s just a small twist in our learning path: we will complete the programming sections first, and in later sections I will introduce the concepts from linear algebra and calculus wherever we need them.
I believe you have your system ready and Python installed already. I am also assuming you are following along with the tutorial series and have virtualenv installed. For a general idea of virtualenv, you can refer to our other article, Python Virtual Environment and Linux User Management. We will be using IPython as our workspace. Those of you who are using Ubuntu 16.04 can easily set it up in a virtual environment using the commands below:
$ cd ~
$ virtualenv learningpython
$ source ~/learningpython/bin/activate
(learningpython) $ pip install ipython
(learningpython) $ ipython
In  :
Stay with me and follow along as we go through this post.
Now, without any further delay, I would like to introduce you to a wonderful programming language. Yes, we are going to learn Python, and Python is awesome. You will soon know why. It will be easier for you to contrast Python’s features with those of other programming languages if you are familiar with at least one of them. If you are a newbie, believe me, Python is one of the easiest programming languages to code in.
Why do we choose Python as our programming language? Python has established itself as a versatile programming platform and is widely used in the data science and artificial intelligence communities. It has huge library support and an awesome community that continuously develops the Python ecosystem and the libraries we will need on our data science journey.
You might have a different opinion, but believe me, we are using Python because of its huge community support and its simplicity.
However, in this post, I am assuming that you are fairly familiar with at least one other programming language. Having said that, let us start.
Before starting, I suggest you run the following command in the IPython shell and read the output carefully.
In : import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Cool. You saw some principles. We will deal with these in detail later.
Okay, what does it take to become an expert in any programming language? What should you look into to take full advantage of it? The first and foremost thing to look at is how the language is designed, and this holds for Python as well. There are a few things you need to peek into: data types, operators, how loops are designed, object-oriented design, and what the language offers off the shelf that sets it apart from other programming languages.
Python is a dynamically typed programming language. This means there is no need to declare data types explicitly: you just assign a value to a variable, and the variable takes on the type of that value.
In : a = int()
In : type(a)
Out: int

In : a = 2
In : type(a)
Out: int
In the first snippet we created ‘a’ by calling int() explicitly, which gives us an integer (with no argument, its value is 0). This looks a little odd (why not ‘int a’?), and I will explain it later on. In the second snippet we simply assigned 2 to ‘a’, and that works just as well. This is where you should understand the concept of dynamic typing: the type belongs to the value, and the variable picks it up on assignment. This is an advantage, but it comes at a cost. Types have to be checked while the program runs, which is one of the reasons Python tends to be slower than statically typed, compiled languages. We are ready to make that compromise in exchange for simplicity, since we now have computationally powerful machines.
There are other data types to play with as well. Here are a few of the built-in types (I have not included them all).
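To see dynamic typing in action, here is a minimal sketch in which the same name is bound to values of different types, one after another. The variable names are only for illustration.

```python
# A name is just a label; the type travels with the value it refers to.
value = 2
first_type = type(value).__name__    # the name currently refers to an int

value = "two"                        # rebinding the same name to a str is allowed
second_type = type(value).__name__

value = 2.0                          # and now to a float
third_type = type(value).__name__

print(first_type, second_type, third_type)
```

No declaration was needed at any point. Each assignment simply makes the name refer to a new value, whatever its type.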
In : a = 2
In : type(a)
Out: int

In : b = '2'
In : type(b)
Out: str

In : d = 2.0
In : type(d)
Out: float

In : e = True
In : type(e)
Out: bool
Now, there is another thing you need to understand. Everything you see in Python is an object. If you don’t know what an object is, for now just think of it as something that represents real-world data. A door, for example, can be taken as an object with data such as what it is made of, its size, and whether it is open or closed, along with the things it can do. If you already have some knowledge of OOP, then I should mention that an object is an instance of a class. So, to continue: everything you see in Python is an object of some class, and for values like numbers and strings that class is a built-in one.
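To make the door analogy concrete, here is a small sketch of how such an object could be modelled. The `Door` class, its attributes and its methods are hypothetical names made up for this illustration; we will look at classes properly later in the series.

```python
class Door:
    """A hypothetical class used only to illustrate the door analogy."""

    def __init__(self, material, height_cm, is_open=False):
        # Data that describes one particular door
        self.material = material
        self.height_cm = height_cm
        self.is_open = is_open

    def open(self):
        # Behaviour: change the status of the door
        self.is_open = True

    def close(self):
        self.is_open = False


front_door = Door("wood", 200)   # front_door is an instance of the Door class
front_door.open()
print(type(front_door).__name__, front_door.material, front_door.is_open)
```

The same idea applies to built-in values such as integers, which are instances of built-in classes.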
In : a = 2
In : a.__class__
Out: int
You see, a is an instance of the int class. This instance is stored in memory and has an address:
In : id(a)
Out: 10919456
Here, the output of id(a) is a number that identifies that instance. In CPython, the standard interpreter, it is the memory address where the object resides, so the exact value will differ on your machine. Let me try something new:
In : x = 2
In : id(x)
Out: 10919456

In : x = 3
In : id(x)
Out: 10919488
If you have some programming background, this might confuse you: x points to a different address once it is given a different value. I leave it to you to do some research and comment on why this happens, and whether it is always true. This behavior is related to the immutability of the data type. int is one of the immutable types, and there are other data types which are mutable.
So let’s talk about a few more data types you need to know. I leave it to you to research whether the data types discussed next are mutable or immutable.
The string data type lets you work with text, that is, sequences of characters. For example, my name “Keshav Bhandari” can be stored in one variable and my gender “M” in another, and both variables are of type str.
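As a starting point for that research, here is a minimal sketch that contrasts rebinding a name to a new integer with changing a list in place. It uses the `is` operator, which checks whether two names refer to the same object. The variable names are only for illustration, and a list is used here simply as a familiar example of a mutable type.

```python
x = 2
old = x                      # keep a second reference to the original int object
x = 3                        # rebinding: x now refers to a different object
rebound_to_new_object = x is not old
old_value_unchanged = (old == 2)

numbers = [1, 2]
same_list = numbers          # a second name for the same list object
numbers.append(3)            # mutation: the list itself is changed in place
still_same_object = numbers is same_list

print(rebound_to_new_object, old_value_unchanged, still_same_object)
print(same_list)
```

The integer 2 itself never changed; x was simply pointed at another object. The list, on the other hand, was modified in place, so both names see the new element.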
In : name = "Keshav Bhandari"
In : gender = "M"
In : name.__class__
Out: str

In : gender.__class__
Out: str

In : type(name)
Out: str

In : type(gender)
Out: str
There are lots of things we can do with a string in Python. I suggest you type a variable name followed by a dot and press Tab in IPython to see all the methods available.
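If you are not in IPython, or Tab completion is not available, the built-in dir() function gives a similar listing. The sketch below filters out the names that start with an underscore and tries a couple of the methods; the variable names are only for illustration.

```python
name = "Keshav Bhandari"

# dir() lists the attribute names of an object; skip the underscore-prefixed ones
public_methods = [m for m in dir(name) if not m.startswith("_")]
print(public_methods)

# Try a couple of them
upper_name = name.upper()
parts = name.split()
print(upper_name, parts)
```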
I will not go through all of these methods; I leave that up to your curious minds. There is, however, a good way to try them out, which I suggest you follow.
Suppose you want to use the lstrip() method. First, look at its documentation:
In : name = 'Keshav Bhandari'
In : ?name.lstrip
Docstring:
S.lstrip([chars]) -> str

Return a copy of the string S with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.
Type: builtin_function_or_method
Hmm, this means that if I have a string like “ Keshav Bhandari” and I want to remove all the leading whitespace, I need lstrip. Let’s try.
In : name = ' Keshav Bhandari'
In : name.lstrip()
Out: 'Keshav Bhandari'
Let’s make this more interesting. What if we want to remove ‘Kes’ from ‘Keshav Bhandari’?
In : name = 'Keshav Bhandari'
In : name.lstrip('Kes')
Out: 'hav Bhandari'
Hmm, pretty handy. I think you should now try the other methods yourself, and if you run into any difficulty, please write in the comments. Next, we will see some slicing techniques for strings. These are useful when you need a certain portion of a string by index.
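One thing to watch out for: as the docstring says, lstrip removes characters, not a prefix. The argument is treated as a set of characters, and every leading character that belongs to that set is stripped, in any order and any number of times. Matching is also case-sensitive. The sketch below shows the difference; `strip_prefix` is a hypothetical helper written for this illustration, not a built-in.

```python
def strip_prefix(text, prefix):
    """Hypothetical helper: remove `prefix` only if `text` starts with it."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


word = "sessions"
by_charset = word.lstrip("ses")          # strips every leading 's' or 'e'
by_prefix = strip_prefix(word, "ses")    # strips the exact prefix once

name = "Keshav Bhandari"
wrong_case = name.lstrip("kes")          # 'K' is not in the set, nothing is removed

print(by_charset)   # ions
print(by_prefix)    # sions
print(wrong_case)   # Keshav Bhandari
```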
In : a = 'this is a short string'
In : a[0] # gives the first element
Out: 't'

In : a[-1] # gives the last element
Out: 'g'

In : a[2:7] # gives elements from index 2 to 6 inclusive
Out: 'is is'

In : a[1:10:2] # gives elements from index 1 to 9 inclusive, in steps of 2
Out: 'hsi  '

In : a[::-1] # gives the reversed string
Out: 'gnirts trohs a si siht'
How does that last snippet, a[::-1], work?
I will end this article here and continue in the second article with more data types.
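One way to think about it: a slice has the form start:stop:step. When the step is negative, Python walks backwards through the string, and leaving start and stop empty means "begin at the last character and go all the way to the first". The sketch below spells out the same walk with an explicit loop; the variable names are only for illustration.

```python
a = "this is a short string"

reversed_with_slice = a[::-1]

# The same walk written out by hand: start at the last index, step by -1
chars = []
index = len(a) - 1
while index >= 0:
    chars.append(a[index])
    index -= 1
reversed_with_loop = "".join(chars)

print(reversed_with_slice)   # gnirts trohs a si siht
print(reversed_with_slice == reversed_with_loop)

# A step of -2 walks backwards too, taking every second character
every_second_backwards = a[::-2]
print(every_second_backwards)
```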