Month: November 2016

Deleting thousands of files from Linux directory

Our project team uses a script which creates thousands of files in a folder and we sometime has to manually clean up all those files.

rm command failed saying the list is too long and we had to find another method to do this. rather than writing s shell script and delete the files one by one we used the find command.

Here how we had done it.

>>>ls -lrt| wc -l
 250,002

We have 250K files in this directory and we tried removing them using the rm command.

>>> rm *.env
 ksh: rm: /bin/rm: cannot execute [Argument list too long]

This issue can be easily solved by the find command and we noticed find runs faster in such situations.

>>>find . -name "*.env" -delete

Above listed find command deleted all .env files in the current directory.

Advertisements

Access Oracle from Python

If you are a data engineer you might have to pull data from various sources and relational databases (RDBMs) is one of them. Python has separate modules to connect to each of relational databases (Like Oracle, MySQL, PostgresSQL etc.). Today we will discuss how we can connect and query Oracle database from Python.

I hope you have working Python environment set up already. Here I’m using Python CLI with Python 2.7 for this tutorial.

1. Download and install oracle 12c database.

I have downloaded Oracle 12C database from OTN (otn.oracle.com) and installed in my CentOS 7 laptop. Since it is installed in my laptop, server name is localhost and the database name is pdborcl.localdomain

2. We need to install an python module called cx_Oracle which can help us to connect and extract data from the Oracle instance. It gave me couple of errors when I first tried to install on my machine.

[oracle@ora-c7 ~]$ sudo pip install cx_Oracle
 Collecting cx_Oracle
 Using cached cx_Oracle-5.2.1.tar.gz
 Complete output from command python setup.py egg_info:
 Traceback (most recent call last):
 File "", line 1, in 
 File "/tmp/pip-build-kkqi2h/cx-Oracle/setup.py", line 170, in 
 raise DistutilsSetupError("cannot locate an Oracle software " \
 distutils.errors.DistutilsSetupError: cannot locate an Oracle software installation

----------------------------------------
 Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-kkqi2h/cx-Oracle/

This is because some packages are missing and we need to install them before installing cx_Oracle. Lets install setuptools and ez-setup.

[root@ora-c7 oracle]# pip install --upgrade setuptools
 Collecting setuptools
 Downloading setuptools-28.7.1-py2.py3-none-any.whl (472kB)
 100% |████████████████████████████████| 481kB 857kB/s
 Installing collected packages: setuptools
 Found existing installation: setuptools 20.10.1
 Uninstalling setuptools-20.10.1:
 Successfully uninstalled setuptools-20.10.1
 Successfully installed setuptools-28.7.1

[root@ora-c7 oracle]# pip install ez_setup
 Collecting ez_setup
 Downloading ez_setup-0.9.tar.gz
 Installing collected packages: ez-setup
 Running setup.py install for ez-setup ... done
 Successfully installed ez-setup-0.9

Now lets try to reinstall cx_Oracle.

[root@ora-c7 oracle]# pip install cx_Oracle
 Collecting cx_Oracle
 Using cached cx_Oracle-5.2.1.tar.gz
 Installing collected packages: cx-Oracle
 Running setup.py install for cx-Oracle ... done
 Successfully installed cx-Oracle-5.2.1

Lets try to run a sample program which connects and pull some data. We will connect and get the oracle database version and query a table. This is a simple program we can do much complex processing with Cx_Oracle, I will list them in another blog.

Oracle version I have is 12C and the table has 100 records in it.

[oracle@ora-c7 ~]$ python
 Python 2.7.5 (default, Sep 15 2016, 22:37:39)
 [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import cx_Oracle
 >>> from pprint import pprint
 >>> db=cx_Oracle.connect('******','*****','localhost:1521/pdborcl.localdomain')
 >>> print db.version
 12.1.0.2.0
 >>> cursor = db.cursor()
 >>> cursor.execute('select count(*) from my_tab');
 >>> pprint(cursor.fetchall())
 [(100,)]

OEM Express in 12C

If you are looking for some basic database monitoring, Oracle 12C provides Enterprise Manager Express for that purpose. It is a light weight version of EM with limited features. I use it to monitor the oracle 12c instance installed on my laptop which runs on CentOS 7.0.

For accessing the EM express we have to enable the HTTP and HTTPS port for the database . Here is how you can check the HTTP and HTTPS port settings for your database.

SQL> select dbms_xdb.getHttpPort() from dual;

DBMS_XDB.GETHTTPPORT()
 ----------------------
 8080

SQL> select dbms_xdb_config.getHttpsPort() from dual;

DBMS_XDB_CONFIG.GETHTTPSPORT()
 ------------------------------
 5501

If you want to manually set up the HTTPS or HTTP port you can do that by logging into the database and running the below commands.

SQL> exec dbms_xdb_config.sethttpsport(5501);

or

SQL> exec dbms_xdb_config.sethttpport(8080);

In order to use EM Express, a database user needs to have been granted the EM_EXPRESS_BASIC or EM_EXPRESS_ALL role.  The DBA role includes both the EM_EXPRESS_BASIC and the EM_EXPRESS_ALL roles.

Once this is done use the below URL to access EM Express.

https://localhost:5501/em

Login page.

emexpress1

Home Page.

emexpress2

Performance Home.

emexpress3

It may not have all the OEM features but a nice tool indeed.