DataFrame in an Oracle Machine Learning Notebook

Introduction

Oracle Machine Learning Notebooks let you combine Python with direct access to data in Oracle Autonomous Database. Oracle provides the oml library for accessing and manipulating that data. The library includes its own DataFrame object, which represents a table or view in the database rather than data held in the notebook's memory.

Exception when iterating through oml DataFrame

When trying to iterate through the oml DataFrame in the code below:

import oml
oml.connect()
my_oml_df = oml.sync(view='TRANSACTIONS')
for ind in range(len(my_oml_df)):
    print(my_oml_df.loc[ind, 'POS_DATE'])

The following exception will be thrown:

Fail to execute line 5: print(my_oml_df.loc[ind, 'POS_DATE'])
Traceback (most recent call last):
File "/tmp/python2948065079680563815/zeppelin_python.py", line 215, in <module>
exec(code, _zcUserQueryNameSpace)
File "<stdin>", line 5, in <module>
AttributeError: 'DataFrame' object has no attribute 'loc'

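For those familiar with Pandas DataFrames, this exception may be confusing. The key point is that an oml DataFrame is not a Pandas DataFrame: it is a separate class that refers to data in the database, and it does not provide Pandas attributes and methods such as loc or apply.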
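One quick way to see which kind of DataFrame a variable holds is to print its fully qualified class name. The helper below is a minimal sketch and not part of the oml library; describe_frame is a hypothetical name, and built-in objects stand in for real DataFrames so the snippet runs anywhere.

```python
def describe_frame(obj):
    """Return the fully qualified class name of obj, e.g. 'builtins.dict'."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


# Built-in objects stand in for real DataFrames in this sketch.
print(describe_frame({}))
print(describe_frame([]))
```

In a notebook, calling describe_frame on the object returned by oml.sync and on a Pandas DataFrame should show a class from the oml package in the first case and a class from pandas in the second.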

Converting oml DataFrame to Pandas DataFrame

Most Python users already know the Pandas DataFrame well, so the practical question is how to convert an oml DataFrame into one and carry on with the usual tasks.

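Enter the pull method from the oml library. The pull method creates a local Python object that contains a copy of the data referenced by the oml object. For an oml.DataFrame, the result is a pandas.DataFrame. Because the data is copied into the notebook's memory, a large table or view can take noticeable time and memory to pull.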

The code below works because the for loop now runs over a Pandas DataFrame:

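import oml
import pandas as pd  # not used directly; pull() already returns a Pandas DataFrame
oml.connect()
# oml DataFrame that refers to the TRANSACTIONS view in the database
my_oml_df = oml.sync(view='TRANSACTIONS')
# Copy the data into local memory as a Pandas DataFrame
my_pandas_df = my_oml_df.pull()
# .loc[ind, ...] assumes the pulled DataFrame has the default 0..n-1 index
for ind in range(len(my_pandas_df)):
    print(my_pandas_df.loc[ind, 'POS_DATE'])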
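Once the data is in a Pandas DataFrame, the index-based loop is only one of several ways to walk through it. The sketch below shows two common alternatives. It builds a small Pandas DataFrame by hand as a stand-in for the result of pull(); the AMOUNT column and all values are made up for illustration and do not come from the TRANSACTIONS view.

```python
import pandas as pd

# Stand-in for my_oml_df.pull(); column values are invented for this example.
my_pandas_df = pd.DataFrame(
    {
        "POS_DATE": ["2024-01-01", "2024-01-02"],
        "AMOUNT": [10.0, 20.0],
    }
)

# Iterate over a single column directly; no row index is needed.
dates = [value for value in my_pandas_df["POS_DATE"]]

# Iterate over whole rows as named tuples, reading columns as attributes.
rows = [(row.POS_DATE, row.AMOUNT) for row in my_pandas_df.itertuples(index=False)]

print(dates)
print(rows)
```

Both forms avoid relying on the row labels being 0 to n-1, which the loc-based loop depends on.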

So, if you are getting errors like either of the following:

AttributeError: 'DataFrame' object has no attribute 'loc'
AttributeError: 'DataFrame' object has no attribute 'apply'

You are most likely calling a Pandas method on an oml DataFrame instead of a Pandas DataFrame. The good news is that the fix is usually a single line that pulls the data into a Pandas DataFrame:

my_pandas_df = my_oml_df.pull()
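If the same code may receive either kind of object, the conversion can be wrapped in a small guard. This is a minimal sketch under stated assumptions: ensure_local and _StandInProxy are hypothetical names, not part of oml or Pandas, and the stand-in class only imitates the one thing that matters here, a pull() method that returns a local copy.

```python
def ensure_local(frame):
    """Return a local object: call pull() if the object offers it, else pass it through."""
    pull = getattr(frame, "pull", None)
    if callable(pull):
        return pull()
    return frame


class _StandInProxy:
    """Hypothetical stand-in for a database-backed object with a pull() method."""

    def __init__(self, rows):
        self._rows = rows

    def pull(self):
        # Return a local copy of the data, as the real pull() is described to do.
        return list(self._rows)


local_from_proxy = ensure_local(_StandInProxy([1, 2, 3]))
already_local = ensure_local([4, 5])
print(local_from_proxy, already_local)
```

In a notebook, the call would look like my_pandas_df = ensure_local(my_oml_df). Objects without a callable pull attribute are returned unchanged.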

Brian Houlihan

Enterprise Consulting Architect

With more than 25 years of extensive experience, Brian Houlihan is a seasoned enterprise architect renowned for his expertise in integrations and the implementation of diverse Cloud platforms. His relentless pursuit of knowledge is evidenced by his recent immersion in artificial intelligence and machine learning.