How do I use the SQLite backend to provide user-defined functions for Python Blaze?

I have a sqlite database connected to Blaze:

df = bz.Data("sqlite:///<my database>")

Everything works, but I don't know how to provide a user-defined function in my interaction with df.
I have a column named IP in df, which is text containing IP addresses. I also have a function toSubnet(x, y), which takes an IP address (x) in text format and returns its /y subnet. For example:

out = toSubnet('1.1.1.1',24)
out
1.1.1.0/24
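For context, here is a minimal sketch of what toSubnet might look like, assuming the ipaddress module (standard library in Python 3, available as a backport for Python 2); the actual implementation is not shown in the question:

import ipaddress

def toSubnet(x, y):
    # Hypothetical implementation: mask the address down to its /y network,
    # e.g. toSubnet('1.1.1.1', 24) -> '1.1.1.0/24'.
    return str(ipaddress.ip_network(u'%s/%d' % (x, y), strict=False))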

Now, if I want to map all IPs to their /14 subnet, I use:

df.IP.map(lambda x: toSubnet(x, 14), 'string')

This works when the backend is CSV, but with the sqlite backend I get NotImplementedError.
What am I doing wrong?

Note: This doesn't tell you how to do exactly what you want, but it explains why it isn't working and what a next step with SQLite might be.

The problem you're running into is that it is very difficult to efficiently execute arbitrary Python code against an arbitrary SQL database.

Blaze uses SQLAlchemy to translate user code into SQL as best it can, and I'm not aware of a way to express an arbitrary Python function (such as your lambda) through SQLAlchemy.

Since almost every database handles user-defined functions (UDFs) differently, it would take a lot of work to build an API that allows the following:

>Users to define functions in Python
>Those pure Python functions to be converted into the database's native UDFs

That said, the Python interface to SQLite does have a way to register Python functions that can be executed inside SQL statements:

https://docs.python.org/2/library/sqlite3.html#sqlite3.Connection.create_function
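As a sketch of what that route looks like (the database path, table name, and SQL-side function name below are placeholders, and this bypasses Blaze entirely):

import sqlite3

conn = sqlite3.connect('mydatabase.db')  # placeholder path to your sqlite file

# Register the Python function (toSubnet as sketched above) under the
# SQL name "to_subnet"; the 2 is the number of arguments it takes.
conn.create_function('to_subnet', 2, toSubnet)

# The registered function can now be called from plain SQL.
for (subnet,) in conn.execute('SELECT to_subnet(IP, 14) FROM my_table'):  # placeholder table name
    print(subnet)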

Currently, there is no way to express a UDF with Blaze over the SQL backend, although this could be implemented as a new expression type that lets users register functions through the DB API of the underlying database.

