Unexpected results from a group by clause

List overview All Threads
Download

newer

older

ProFox List Statistics for March...

Tuning VFP SQL Pass-Through For...

Joe Yoder

30 Mar 2018 30 Mar '18

3 a.m.

I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

--- StripMime Report -- processed MIME parts --- multipart/alternative text/plain (text body -- kept) text/html ---

Show replies by date

Vince Teachout

30 Mar 30 Mar

3:47 a.m.

Joe,

Use the Order by Clause to set your ordering. Group by will, by design, eliminate duplicates. GROUP BY is used for aggregate functions, such as sum(), count() etc.

On 03/29/18 11:00 PM, Joe Yoder wrote:

...

I have an SQL select statement that includes a group by clause to order the output for reporting. ...

...

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

--- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus

mbsoftwaresolutions＠mbsoftwaresolutions.com

4:37 a.m.

On 2018-03-29 23:00, Joe Yoder wrote:

...

I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

Hi Joe,

Replace the GROUP BY with the ORDER BY clause and it won't drop any records.

hth, --Mike

Richard Kaye

11:34 a.m.

You can also add COUNT(*) to your query to see how many rows are duplicated and use a HAVING to see just the duplicated rows.

-----Original Message----- From: ProfoxTech profoxtech-bounces@leafe.com On Behalf Of mbsoftwaresolutions@mbsoftwaresolutions.com Sent: Friday, March 30, 2018 12:37 AM To: profoxtech@leafe.com Subject: Re: Unexpected results from a group by clause

On 2018-03-29 23:00, Joe Yoder wrote:

...

I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

Hi Joe,

Replace the GROUP BY with the ORDER BY clause and it won't drop any records.

hth, --Mike

Stephen Russell

2:32 p.m.

The having clause allows the query to limit the output of the data set, it doesn't refocus on the raw data for additional processing.

select PO.[BuyFromBuinessPartner], PO.[PurchaseOrder], PO.[Division], POL.[Item], POL.[Price] -- all items needed in group by , sum( POL.[OrderedQuantity]) totalQuant -- reason for the group by from [dbo].[FactPurchaseOrders] PO left join [dbo].[FactPurchaseOrderLines] POL on PO.[PurchaseOrder] = POL.[PurchaseOrder] and PO.[Division]= POL.[Division] where PO.Division ='RCT' and PO.OrderDate between '1-1-2018' and '4-1-2018' group by PO.[BuyFromBuinessPartner], PO.[PurchaseOrder], PO.[Division],POL.[Item], POL.[Price]

Having sum( POL.[OrderedQuantity]) > 100000 -- Just show me anything over 100000 in quantity

HTH

On Fri, Mar 30, 2018 at 6:34 AM, Richard Kaye rkaye@invaluable.com wrote:

...

You can also add COUNT(*) to your query to see how many rows are duplicated and use a HAVING to see just the duplicated rows.

--

rk

-----Original Message----- From: ProfoxTech profoxtech-bounces@leafe.com On Behalf Of mbsoftwaresolutions@mbsoftwaresolutions.com Sent: Friday, March 30, 2018 12:37 AM To: profoxtech@leafe.com Subject: Re: Unexpected results from a group by clause

On 2018-03-29 23:00, Joe Yoder wrote:

...
I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

Hi Joe,

Replace the GROUP BY with the ORDER BY clause and it won't drop any records.

hth, --Mike

[excessive quoting removed by server]

Richard Kaye

2:48 p.m.

I think you're mixing this up with the other SQL thread. The OP was a little puzzled by GROUP behavior and what he wants, I think, is to identify duplicate records. He didn't say what he wants to do with the duplicates. I just suggested the count and having so he could get a feel for how many dupes are there.

-----Original Message----- From: ProfoxTech profoxtech-bounces@leafe.com On Behalf Of Stephen Russell Sent: Friday, March 30, 2018 10:33 AM To: profoxtech@leafe.com Subject: Re: Unexpected results from a group by clause

The having clause allows the query to limit the output of the data set, it doesn't refocus on the raw data for additional processing.

Having sum( POL.[OrderedQuantity]) > 100000 -- Just show me anything over 100000 in quantity

HTH

On Fri, Mar 30, 2018 at 6:34 AM, Richard Kaye rkaye@invaluable.com wrote:

...

You can also add COUNT(*) to your query to see how many rows are duplicated and use a HAVING to see just the duplicated rows.

--

rk

-----Original Message----- From: ProfoxTech profoxtech-bounces@leafe.com On Behalf Of mbsoftwaresolutions@mbsoftwaresolutions.com Sent: Friday, March 30, 2018 12:37 AM To: profoxtech@leafe.com Subject: Re: Unexpected results from a group by clause

On 2018-03-29 23:00, Joe Yoder wrote:

...
I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

Hi Joe,

Replace the GROUP BY with the ORDER BY clause and it won't drop any records.

hth, --Mike

[excessive quoting removed by server]

Ricardo Araoz

8:39 p.m.

Yup, it's like a "where" but it acts kind of after the grouping so you can use aggregate functions to limit the output.

On 30/03/18 11:32, Stephen Russell wrote:

...

The having clause allows the query to limit the output of the data set, it doesn't refocus on the raw data for additional processing.

select PO.[BuyFromBuinessPartner], PO.[PurchaseOrder], PO.[Division], POL.[Item], POL.[Price] -- all items needed in group by , sum( POL.[OrderedQuantity]) totalQuant -- reason for the group by from [dbo].[FactPurchaseOrders] PO left join [dbo].[FactPurchaseOrderLines] POL on PO.[PurchaseOrder] = POL.[PurchaseOrder] and PO.[Division]= POL.[Division] where PO.Division ='RCT' and PO.OrderDate between '1-1-2018' and '4-1-2018' group by PO.[BuyFromBuinessPartner], PO.[PurchaseOrder], PO.[Division],POL.[Item], POL.[Price]

Having sum( POL.[OrderedQuantity]) > 100000 -- Just show me anything over 100000 in quantity

HTH

On Fri, Mar 30, 2018 at 6:34 AM, Richard Kaye rkaye@invaluable.com wrote:

...
You can also add COUNT(*) to your query to see how many rows are duplicated and use a HAVING to see just the duplicated rows.

--

rk

-----Original Message----- From: ProfoxTech profoxtech-bounces@leafe.com On Behalf Of mbsoftwaresolutions@mbsoftwaresolutions.com Sent: Friday, March 30, 2018 12:37 AM To: profoxtech@leafe.com Subject: Re: Unexpected results from a group by clause

On 2018-03-29 23:00, Joe Yoder wrote:

...
I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

Hi Joe,

Replace the GROUP BY with the ORDER BY clause and it won't drop any records.

hth, --Mike

[excessive quoting removed by server]

Jürgen Wondzinski

10:56 p.m.

New subject: AW: Unexpected results from a group by clause

HAVING is basically a Post-Processing of the SQL-Result. Therefor HAVING uses the newly defined column names, not the original field names:

SELECT ZipCode, count(*) AS HowMany FROM Adresses GROUP BY 1 WHERE State = "TX" HAVING HowMany > 1000

-----Ursprüngliche Nachricht----- Von: ProFox profox-bounces@leafe.com Im Auftrag von Ricardo Araoz Gesendet: Freitag, 30. März 2018 22:40 An: profox@leafe.com Betreff: Re: Unexpected results from a group by clause

Yup, it's like a "where" but it acts kind of after the grouping so you can use aggregate functions to limit the output.

Ted Roche

3:41 p.m.

GROUP BY is intended to GROUP multiple records together so you can calculate MAX(), MIN(), SUM() and similar "aggregate" functions on multiple records and report them in one line.

ORDER BY specifies the ORDER of the resulting recordset.

It's just a confusing coincidence that VFP usually reports records that are GROUPed in the correct order. Don't count on that, as it's not part of the SQL standard to do that, and some data engines don't.

On Thu, Mar 29, 2018 at 11:00 PM, Joe Yoder joe@wheypower.com wrote:

...

I have an SQL select statement that includes a group by clause to order the output for reporting. It gets its data from a table that potentially includes legitimate duplicate records.

SELECT account, date, memo, paid_amoun; FROM QB; GROUP BY account, date, memo, paid_amoun; INTO CURSOR det

I happened to discover that the output of the select statement does not include duplicate records. Is this expected behavior? If so , how should one group data with duplicate records?

Thanks in advance,

Joe

--- StripMime Report -- processed MIME parts --- multipart/alternative text/plain (text body -- kept) text/html

[excessive quoting removed by server]

2882

Age (days ago)

2882

Last active (days ago)

profox@leafe.com

8 comments

8 participants

tags (0)

participants (8)

Joe Yoder
Jürgen Wondzinski
mbsoftwaresolutions＠mbsoftwaresolutions.com
Ricardo Araoz
Richard Kaye
Stephen Russell
Ted Roche
Vince Teachout