Is my PostgreSQL table too big? Not enough storage?

I have a table of about 3.8 million rows. When I query the entire table, I get

ERROR: value overflows numeric format

referring to the value returned by a user-defined function.
However, if I split the table roughly in half (see below), everything works fine.

SELECT day, item, price,
CAST(my_func(price) OVER (PARTITION BY item ORDER BY day) AS numeric(8,2))
FROM my_table
--WHERE day < '3/1/2013';
--WHERE day >= '3/1/2013';

With either WHERE clause, the statement executes without error.

The price column is numeric(8,2), and no value in it exceeds what numeric(8,2) can hold. In any case, changing the cast to numeric(20,2) makes no difference.

This is the table definition:

CREATE TABLE my_table
(
item character(5) NOT NULL,
day date NOT NULL,
price numeric(8,2),
CONSTRAINT my_table_pk PRIMARY KEY (item, day)
);

...and the function:

CREATE OR REPLACE FUNCTION my_func2 (avg numeric, IN price numeric)
RETURNS numeric AS $$
DECLARE
alpha numeric;
BEGIN
alpha := 2.0/51;
RETURN
CASE
WHEN avg IS NULL THEN price -- avg is NULL for the first row, so price is returned
ELSE round((alpha * price + (1-alpha) * avg),2)
END;
END;
$$LANGUAGE PLPgSQL;

...and the aggregate:

CREATE AGGREGATE my_func(numeric) (SFUNC = my_func2, STYPE = numeric);

The error occurs in your cast operation. The numeric(8,2) format is very strict, and my_func() may be returning a value that does not fit that format definition. To see how the format behaves, check the following queries:

select 12.34::numeric(8,2);
numeric
---------
12.34

select 12.345678::numeric(8,2);
numeric
---------
12.35

select 12.3456789::numeric(8,2);
numeric
---------
12.35

select 123456.123456789::numeric(8,2);
numeric
-----------
123456.12

select 1234567.123456789::numeric(8,2);
ERROR: numeric field overflow
DETAIL: A field with precision 8, scale 2 must round to an absolute value less than 10^6.

select 1234567.8::numeric(8,2);
ERROR: numeric field overflow
DETAIL: A field with precision 8, scale 2 must round to an absolute value less than 10^6.

Notice that the result can never have more than 8 digits in total, and it always has 2 decimal places. The last two queries fail because their results would need more than 8 digits. For example, 1234567.123456789 would be rounded to 1234567.12, but 1234567.12 has 9 digits, not 8. The same holds for 1234567.8, even though that literal has only 8 digits: the result must carry 2 decimal places, so postgres would have to output 1234567.80, which again has 9 digits instead of 8.
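The boundary can be probed directly. The following is a small sketch (the literals are chosen purely for illustration) suggesting that rounding to the scale happens before the precision check, so a value just below the limit can still overflow once it is rounded:

```sql
-- Largest value numeric(8,2) can hold: 6 integer digits plus 2 decimal digits.
SELECT 999999.99::numeric(8,2);    -- succeeds

-- Rounds down to 999999.99, so it still fits.
SELECT 999999.994::numeric(8,2);   -- succeeds

-- Rounds up to 1000000.00, which would need 9 digits.
SELECT 999999.995::numeric(8,2);   -- ERROR: numeric field overflow
```

This means an aggregate result does not have to reach 1000000 to trigger the error; anything from 999999.995 upward is enough.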

So, there are several ways to solve this problem:

>Use numeric(16,2) for a total of 16 digits (choose whatever total you need), which increases the number of integer digits my_func() is allowed to return.
>Use another number type, such as unconstrained numeric or real. For example: (my_func(price) OVER (PARTITION BY item ORDER BY day))::real
>If you need a fixed number of decimal places and an unlimited total number of digits, try round(my_func(price) OVER (PARTITION BY item ORDER BY day), 2). Alternatively, edit my_func() so that it returns round(returned_value, 2).
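As a rough side-by-side of the three options, here is a sketch that applies each of them to a single literal instead of the window expression. The value 1234567.891 is made up for illustration and stands in for whatever my_func() returns:

```sql
SELECT
    -- Option 1: a wider precision leaves room for 14 integer digits.
    CAST(1234567.891 AS numeric(16,2)) AS wider_cast,
    -- Option 2: a floating-point type never raises the overflow seen above,
    -- but real is documented to keep only about 6 significant decimal digits.
    1234567.891::real                  AS as_real,
    -- Option 3: round() fixes the scale without limiting the total digits.
    round(1234567.891, 2)              AS rounded;
```

The first and third columns both come out as 1234567.89; the second is an approximation, which is worth keeping in mind if the result is a price.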

To help you understand and/or find the cause of the error, consider this: for at least one row, my_func() evaluates to a number with more than 6 digits to the left of the decimal point. To find which rows cause the error, just run the following query:

WITH not_casted AS (
SELECT day,item,price,
my_func(price) OVER (PARTITION BY item ORDER BY day) AS fprice
FROM my_table
)
SELECT * FROM not_casted
WHERE fprice> 999999.99

The rows returned by this query are the ones that cause the cast error. Obviously, this only works if there is no cast to numeric(8,2) inside my_func() itself; otherwise the error is raised by that internal cast. Without knowing the function code, it is impossible to make further assumptions.

UPDATE

I prepared an example based on simulated data. The code does the following:
– Creates several AGGREGATEs that use different typecasting and rounding methods
– Runs each AGGREGATE on a random sample that (hopefully) simulates your data. It generates 10 prices per day, one for each of 10 items, over 31 days. This is not a good simulation, but that does not matter for demonstrating the loss of accuracy, so please don't blame me if the data is not simulated correctly :)

This is the code that creates the four functions and aggregates:

-- typecast price and arithmetics to numeric(8,2)
CREATE OR REPLACE FUNCTION my_func_numeric_8_2_a (avg numeric(8,2), IN price numeric(8,2))
RETURNS numeric(8,2) AS $$
DECLARE
alpha numeric;
BEGIN
alpha := 2.0/51;
RETURN
CASE
WHEN avg IS NULL THEN price
ELSE (alpha * price + (1-alpha) * avg)::numeric(8,2)
END;
END;
$$LANGUAGE PLPgSQL;
CREATE AGGREGATE my_func_numeric_8_2(numeric(8,2)) (SFUNC = my_func_numeric_8_2_a, STYPE = numeric(8,2));


-- typecast price and arithmetics to numeric and round(arithmetics, 2)
CREATE OR REPLACE FUNCTION my_func_numeric_round_a(avg numeric, IN price numeric)
RETURNS numeric AS $$
DECLARE
alpha numeric;
BEGIN
alpha := 2.0/51;
RETURN
CASE
WHEN avg IS NULL THEN price
ELSE round((alpha * price + (1-alpha) * avg), 2)
END;
END;
$$LANGUAGE PLPgSQL;
CREATE AGGREGATE my_func_numeric_round(numeric) (SFUNC = my_func_numeric_round_a, STYPE = numeric);

-- no typecast (double precision type)
CREATE OR REPLACE FUNCTION my_func_dp_a(avg double precision, IN price double precision)
RETURNS double precision AS $$
DECLARE
alpha double precision;
BEGIN
alpha := 2.0/51;
RETURN
CASE
WHEN avg IS NULL THEN price
ELSE (alpha * price + (1-alpha) * avg)
END;
END;
$$LANGUAGE PLPgSQL;
CREATE AGGREGATE my_func_dp(double precision) (SFUNC = my_func_dp_a, STYPE = double precision);

-- typecast price and arithmetics to numeric
CREATE OR REPLACE FUNCTION my_func_numeric_a(avg numeric, IN price numeric)
RETURNS numeric AS $$
DECLARE
alpha numeric;
BEGIN
alpha := 2.0/51;
RETURN
CASE
WHEN avg IS NULL THEN price
ELSE (alpha * price + (1-alpha) * avg)
END;
END;
$$LANGUAGE PLPgSQL;
CREATE AGGREGATE my_func_numeric(numeric) (SFUNC = my_func_numeric_a, STYPE = numeric);

Now, the code that simulates the data and applies the four aggregates:

WITH sample AS
(
SELECT "day", (random())*10 AS price, generate_series(1,10)::text AS item
FROM (SELECT generate_series('2000-01-01'::timestamp, '2000-01-31'::timestamp, '1 day'::interval)::date AS "day") AS calendar
)
SELECT "day", item, price,
-- typecast price and arithmetics to numeric(8,2)
my_func_numeric_8_2(price::numeric(8,2)) OVER (PARTITION BY item ORDER BY "day") AS numeric_8_2,

-- typecast price and arithmetics to numeric and round(arithmetics, 2)
my_func_numeric_round(price::numeric) OVER (PARTITION BY item ORDER BY "day") AS numeric_round,

-- typecast price and arithmetics to numeric and round the final result
round(my_func_numeric(price::numeric) OVER (PARTITION BY item ORDER BY "day"), 2) AS round_numeric,

-- no typecast (double precision type)
my_func_dp(price) OVER (PARTITION BY item ORDER BY "day") AS no_typecast,

-- typecast price and arithmetics to numeric
my_func_numeric(price::numeric) OVER (PARTITION BY item ORDER BY "day") AS numeric
FROM sample
ORDER BY item, "day";

Because random() is used, each execution of the query produces different results. Scroll through the results and you will see that many rows have different values across the columns, even though every value in a row is calculated from the same price. Also, the columns are ordered by decreasing precision loss (that is, by increasing precision): my_func_dp(price) is the most accurate of the four, and my_func_numeric_8_2(price::numeric(8,2)) is the least accurate, but the most “precise”.

If you run the previous query from the command line, you will notice that my_func_numeric(price::numeric) returns numbers of increasing length, because numeric keeps as much precision as it can, so the length of its results varies. If you run it from pgAdmin, you get a rounded version of the full-length number.
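The growth in length can be reproduced without the aggregates. The following sketch iterates the same update formula five times in a recursive CTE, once without rounding and once with round(..., 2) at every step. The starting value 5.00 and the constant price 7.25 are made-up inputs, and scale() is assumed to be available (PostgreSQL 9.6 or later):

```sql
WITH RECURSIVE ema(step, unrounded, rounded) AS (
    -- First row: the state is simply the first price.
    SELECT 1, 5.00::numeric, 5.00::numeric
    UNION ALL
    -- Later rows: alpha * price + (1 - alpha) * previous state, alpha = 2.0/51.
    SELECT step + 1,
           (2.0/51) * 7.25 + (1 - 2.0/51) * unrounded,
           round((2.0/51) * 7.25 + (1 - 2.0/51) * rounded, 2)
    FROM ema
    WHERE step < 5
)
SELECT step,
       unrounded,
       scale(unrounded) AS unrounded_scale,  -- number of decimal digits kept
       rounded
FROM ema
ORDER BY step;
```

In the unrounded column the number of decimal digits should increase from one step to the next, while the rounded column stays at 2 decimal places, which matches the difference between my_func_numeric and my_func_numeric_round described above.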

Screenshot of a portion of the simulated results.
