The following PowerShell script counts the files and totals the file sizes for each unique file extension in the current folder, recursively.
How would you speed up this script for a folder with over 170,000 entries?
The problem is that the following two statements are repeated for each unique file extension, causing a re-scan of $AllFiles, which contains over 170,000 entries.
$Files = $AllFiles | Where { $_.Extension -eq $Ext.Extension }
$FilesSum = ($Files | Measure-Object Length -Sum).Sum
*** BEGIN ***
Write-Host "Reading directory ... $Path"
$AllFiles = Get-ChildItem $Path -Include * -Recurse -OutBuffer 2048 | Where { $_.PSIsContainer -eq $false }
Write-Host "Number of Files = $(@($AllFiles).Count)"
$AllSum = ($AllFiles | Measure-Object Length -Sum).Sum
If ($AllFiles.Length) {
    Write-Host "Counting unique file extensions = " -NoNewLine
    $Extensions = $AllFiles | Select Extension -Unique | Sort Extension
    Write-Host ($Extensions).Count
    foreach ($x in $Extensions) { Write-Host $x.Extension }
    exit
    ForEach ($Ext in $Extensions) {
        Write-Host "Counting ... $Ext = " -NoNewline
        $Files = $AllFiles | Where { $_.Extension -eq $Ext.Extension }
        $FilesSum = ($Files | Measure-Object Length -Sum).Sum
        Write-Host $FilesSum
        $Percent = "{0:N0}" -f (($FilesSum / $AllSum) * 100)
        $Body += " <tr><td>$($Ext.Extension)</td><td>$(@($Files).Count)</td><td><div class=""green"" style=""width:$Percent%"">$('{0:N2}MB' -f ($FilesSum / 1mb))</div></td></tr>"
    }
    $HTML = $HeaderHTML + $Body + $FooterHTML
    Write-Host "Writing file $OutputPath\FilesByExtension.html"
    $HTML | Out-File $OutputPath\FilesByExtension.html
    Write-Host "Done!"
}
Else {
    Write-Host "`nNo files found in $Path"
}
*** END ***
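The fix hinted at above is one pass that accumulates count and total size per extension as it goes, instead of re-filtering $AllFiles once per extension. Here is a minimal sketch of that idea in Python (the alternative floated later in the thread); the function name and layout are illustrative, not from any post:

```python
import os
from collections import defaultdict

def sizes_by_extension(root):
    """Walk `root` once, accumulating [file count, total bytes] per extension."""
    stats = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()  # '' for extensionless files
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            stats[ext][0] += 1
            stats[ext][1] += size
    return dict(stats)
```

In PowerShell itself, the same single-pass idea would be one $AllFiles | Group-Object Extension call, measuring each group once, rather than one Where filter per extension.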
Do it in native VFP ...
Dave
-----Original Message----- From: ProFox [mailto:profox-bounces@leafe.com] On Behalf Of Man-wai Chang Sent: 18 October 2016 06:45 To: ProFox Email List profox@leafe.com Subject: Speeding up a Powershell script
-- .~. Might, Courage, Vision. SINCERITY! / v \ 64-bit Ubuntu 9.10 (Linux kernel 2.6.39.3) /( _ )\ http://sites.google.com/site/changmw ^ ^ May the Force and farces be with you!
[excessive quoting removed by server]
On Tue, Oct 18, 2016 at 6:28 AM, Dave Crozier DaveC@flexipol.co.uk wrote:
+1!
Last resort! :)
On Tue, Oct 18, 2016 at 6:28 PM, Dave Crozier DaveC@flexipol.co.uk wrote:
A great use case for getting your feet wet with Python.
BTW: Powershell syntax wants me to gouge my eyes out. Anyone else feel the same way?
Malcolm
+1... For a moment I thought the script was written in Mandarin!
Totally unintelligible unless you deal with it all day long.
+1 for Python ... 15 lines max, I reckon (I am only a beginner), and 40 lines in VFP
Dave
-----Original Message----- From: ProFox [mailto:profox-bounces@leafe.com] On Behalf Of Malcolm Greene Sent: 19 October 2016 14:43 To: profox@leafe.com Subject: Re: Speeding up a Powershell script
For the record:
I speak Cantonese and it is quite different from both Mandarin (Taiwan) and Putonghua (Beijing)! And .... the PLA army speaks Putonghua. :)
On Wed, Oct 19, 2016 at 9:45 PM, Dave Crozier DaveC@flexipol.co.uk wrote:
Its native query language is definitely not 100% basic ANSI SQL ...
On Wed, Oct 19, 2016 at 9:43 PM, Malcolm Greene profox@bdurham.com wrote:
It is .NET for the command line. You ask specific cmdlets to get data for you and then iterate through it. In this case you are making a new container object to hold the "new look", as I vaguely remember.
Currently in my own SharePoint hell of an upgrade that duped all the list contents in the 2013 test. Removing the dupes kills a frigging day...
On Thu, Oct 20, 2016 at 8:29 AM, Man-wai Chang changmw@gmail.com wrote:
On 2016-10-19 09:43, Malcolm Greene wrote:
+1. It's like the UberNerds who want to obfuscate just for the "fun" of it to be UberNerds. I felt similar about RegExp.
Powershell is "objectized" batch files. Your nerd status is reduced when you diss objects like that.
https://gallery.technet.microsoft.com/scriptcenter/Remove-Windows-Store-Apps...
Because PowerShell is very cool. I have 25-30 scripts that I run monthly for a variety of odd reasons: SharePoint information, reporting, and possibly control, as well as similar things in SQL Server.
On Fri, Oct 21, 2016 at 7:01 AM, < mbsoftwaresolutions@mbsoftwaresolutions.com> wrote:
On 2016-10-19 09:43, Malcolm Greene wrote:
A great use case for getting your feet wet with Python.
BTW: Powershell syntax wants me to gouge my eyes out. Anyone else feel the same way?
Malcolm
+1. It's like the UberNerds who want to obfuscate just for the "fun" of it to be UberNerds. I felt similar about RegExp.
[excessive quoting removed by server]
Linux:
find . -type f -printf "%f %s\n" | awk '{ PARTSCOUNT=split( $1, FILEPARTS, "." ); EXTENSION=PARTSCOUNT == 1 ? "NULL" : FILEPARTS[PARTSCOUNT]; FILETYPE_MAP[EXTENSION]+=$2 } END { for( FILETYPE in FILETYPE_MAP ) { print FILETYPE_MAP[FILETYPE], FILETYPE; } }' | sort -n
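For anyone who finds the awk opaque (as the thread jokes), the classification rule it implements — the part of the basename after the last dot, or "NULL" when there is no dot at all — reads like this in Python (an illustrative translation, not from any post):

```python
def awk_style_extension(filename):
    """Mirror the awk split: text after the last '.', or 'NULL' if no dot."""
    parts = filename.split(".")
    return "NULL" if len(parts) == 1 else parts[-1]
```

Note one quirk shared with the awk version: a dotfile such as .bashrc is classified under extension "bashrc".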
Dave
Or:
for i in $(find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u); do echo "$i"": ""$(du -hac **/*."$i" | tail -n1 | awk '{print $1;}')"; done | sort -h -k 2 -r
Assuming you have globstar enabled (the ** recursive glob needs it): shopt -s globstar
-----Original Message----- From: ProFox [mailto:profox-bounces@leafe.com] On Behalf Of Dave Crozier Sent: 19 October 2016 14:50 To: ProFox Email List profox@leafe.com Subject: RE: Speeding up a Powershell script
On 19/10/2016 14:51, Dave Crozier wrote:
Feck me Dave, what was that you were saying about Mandarin/unintelligible ;-)
Peter
This communication is intended for the person or organisation to whom it is addressed. The contents are confidential and may be protected in law. Unauthorised use, copying or disclosure of any of it may be unlawful. If you have received this message in error, please notify us immediately by telephone or email.
www.whisperingsmith.com
Whispering Smith Ltd Head Office:61 Great Ducie Street, Manchester M3 1RR. Tel:0161 831 3700 Fax:0161 831 3715
London Office:17-19 Foley Street, London W1W 6DW Tel:0207 299 7960
30 lines in VFP. As a bonus, you end up with a cursor of data, and an on-topic thread!
CREATE CURSOR curFiles (filesPK int NOT NULL autoinc, extension c(12), totalsize I)
INDEX ON extension TAG extension
lnResult = Recurse()
WAIT WINDOW NOWAIT "Completed counting " + TRANSFORM(lnResult) + " directory entries"
BROWSE

FUNCTION Recurse(tcPath)
    IF EMPTY(tcPath)
        tcPath = "."
    ENDIF
    LOCAL laDir[1], i, lnCount, lnResult
    STORE 0 TO lnResult
    lnCount = ADIR(laDir, tcPath + '*', 'D')
    FOR i = 1 TO lnCount
        IF 'D' $ laDir[i,5]  && drill into subdirectories
            IF LEFT(laDir[i,1], 1) # '.'
                lnResult = lnResult + Recurse(tcPath + ADDBS(laDir[i,1]))
            ENDIF
            LOOP
        ENDIF
        * WAIT WINDOW NOWAIT tcPath + laDir[i,1]
        * ? tcPath + laDir[i,1]
        AddToFile(JUSTEXT(laDir[i,1]), laDir[i,2])
    NEXT
RETURN lnCount + lnResult

FUNCTION AddToFile(tcExtension, tiFileSize)
    LOCATE FOR extension = tcExtension
    IF FOUND()
        REPLACE totalsize WITH totalsize + tiFileSize NEXT 1 IN curFiles
    ELSE
        INSERT INTO curFiles (extension, totalsize) VALUES (tcExtension, tiFileSize)
    ENDIF
RETURN
On Wed, Oct 19, 2016 at 10:31 AM, Peter Cushing pcushing@whisperingsmith.com wrote:
I actually wrote and compiled a console C program to do it using VS 2013 Pro. :)
https://sites.google.com/site/changmw/scripting/dire/dire-20161020.zip?attre...
On Thu, Oct 20, 2016 at 4:13 AM, Ted Roche tedroche@gmail.com wrote:
Man-Wai, I hope you didn't take the comment as disrespectful, as it wasn't meant to be in any way at all ;-)
I must admit that Powershell is a little like Regular Expressions in that once you use it on a regular basis it seems simple but if you only visit it once in a blue moon then the learning or re-learning curve is huge!
Hope you found a solution anyway.
Dave
-----Original Message----- From: ProFox [mailto:profox-bounces@leafe.com] On Behalf Of Man-wai Chang Sent: 20 October 2016 14:35 To: ProFox Email List profox@leafe.com Subject: Re: Speeding up a Powershell script
On Thu, Oct 20, 2016 at 10:50 PM, Dave Crozier DaveC@flexipol.co.uk wrote:
I hope you didn't take the comment as disrespectful, as it wasn't meant to be in any way at all ;-)
Of course not. It's easier to do it in Visual Foxpro!
I must admit that Powershell is a little like Regular Expressions in that once you use it on a regular basis it seems simple but if you only visit it once in a blue moon then the learning or re-learning curve is huge! Hope you found a solution anyway.
Iterating $AllFiles just once is the fastest way out, I believe. Well, let's see whether I can copy some code from others... :)
Well, I suppose if I wanted a console program, I could have used FP-DOS, though with SYS(2000) and FSIZE() instead of ADIR().
But I have a SQL-queryable cursor and can use the report writer or textmerge to produce output in the format desired.
On Thu, Oct 20, 2016 at 9:35 AM, Man-wai Chang changmw@gmail.com wrote:
On 2016-10-19 10:31, Peter Cushing wrote:
Feck me Dave, what was that you were saying about Mandarin/unintelligible ;-)
LOL!!!!!
Powershell does not have awk and find. Are you using portable Ubuntu in Windows 10?
On Wed, Oct 19, 2016 at 9:49 PM, Dave Crozier DaveC@flexipol.co.uk wrote: