
Linux: remove the first line of a CSV file; remove the first line of a text file with PowerShell

Source: 图艺博知识网

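Remove the first line of a text file with PowerShell

I am trying to remove just the first line of about 5000 text files before I import them.

I am still very new to PowerShell, so I am not sure what to search for or how to approach this. My current concept, in pseudocode: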

set-content file (get-content unless line contains amount)

However, I can't seem to figure out how to do the "contains" part.

Buddy Lindsey asked 2020-07-08T22:27:19Z

10 answers

36 votes

It's not the most efficient approach in the world, but this should work:

get-content $file |
    select -Skip 1 |
    set-content "$file-temp"

move "$file-temp" $file -Force

Richard Berg answered 2020-07-08T22:27:28Z

35 votes

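While I really admire the answer from @hoge, both for its very concise technique and for the wrapper function that generalizes it, and I encourage upvotes for it, I feel compelled to comment on the other two answers that use temp files (it grates on me like fingernails on a chalkboard!).

Assuming the file is not huge, you can force the pipeline to operate in discrete sections, and thereby avoid the need for a temp file, with judicious use of parentheses: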

(Get-Content $file | Select-Object -Skip 1) | Set-Content $file

...or in shorthand:

(gc $file | select -Skip 1) | sc $file

Michael Sorens answered 2020-07-08T22:27:57Z
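Since the question involves roughly 5000 files, the parenthesized pipeline above can be applied in a loop. The following is only a minimal sketch: the function name `Remove-FirstLine`, its parameters, and the `*.txt` default filter are assumptions for illustration and do not come from the answer. It also assumes PowerShell 3.0 or later (for `-File` and the short `[Parameter(Mandatory)]` form).

```powershell
# Hypothetical helper: drops the first $Skip lines of every matching file, in place.
function Remove-FirstLine {
    param(
        [Parameter(Mandatory)] [string] $Folder,   # assumed location of the files
        [string] $Filter = '*.txt',                # assumed file pattern
        [int] $Skip = 1                            # number of leading lines to drop
    )
    Get-ChildItem -Path $Folder -Filter $Filter -File | ForEach-Object {
        $path = $_.FullName
        # The parentheses make Get-Content read the whole file and release it
        # before Set-Content starts writing to the same path.
        (Get-Content -LiteralPath $path | Select-Object -Skip $Skip) |
            Set-Content -LiteralPath $path
    }
}
```

A call such as `Remove-FirstLine -Folder C:\data` (the folder is a placeholder) would process every `.txt` file directly inside that folder. Because each file is read fully into memory, this suits many small files rather than a few very large ones.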

10 votes

Using variable notation, you can do it without a temporary file:

${C:\file.txt} = ${C:\file.txt} | select -skip 1

function Remove-Topline ( [string[]]$path, [int]$skip=1 ) {
    if ( -not (Test-Path $path -PathType Leaf) ) {
        throw "invalid filename"
    }

    ls $path |
        % { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
}

hoge answered 2020-07-08T22:28:17Z

8 votes

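I just had to do the same task, and the gc | select ... | sc approach took up 4 GB of RAM on my machine while reading a 1.6 GB file. It still had not finished at least 20 minutes after reading in the whole file (as reported by Read Bytes in Process Explorer), at which point I had to kill it.

My solution was to use a more .NET-style approach: StreamReader + StreamWriter. For a good discussion of the performance, see this answer: In PowerShell, what is the most efficient way to split a large text file by record type?

Below is my solution. Yes, it uses a temporary file, but in my case that did not matter (it was a huge file of SQL table-creation and insert statements):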

PS> (measure-command{
    $i = 0
    $ins = New-Object System.IO.StreamReader "in/file/pa.th"
    $outs = New-Object System.IO.StreamWriter "out/file/pa.th"
    while( !$ins.EndOfStream ) {
        $line = $ins.ReadLine();
        if( $i -ne 0 ) {
            $outs.WriteLine($line);
        }
        $i = $i+1;
    }
    $outs.Close();
    $ins.Close();
}).TotalSeconds

It returned:

188.1224443

AASoft answered 2020-07-08T22:28:51Z

5 votes

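Inspired by AASoft's answer, I went on to improve it a bit more:

Avoid the loop variable $i and the comparison with 0 in every iteration

Wrap the execution in a try..finally block so the files in use are always closed

Make the solution work for an arbitrary number of lines to remove from the beginning of the file

These changes lead to the following code: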

$p = (Get-Location).Path

(Measure-Command {
    # Number of lines to skip
    $skip = 1
    $ins = New-Object System.IO.StreamReader ($p + "\test.log")
    $outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
    try {
        # Skip the first N lines, but allow for fewer than N, as well
        for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
            $ins.ReadLine()
        }
        # Copy every remaining line to the output file
        while( !$ins.EndOfStream ) {
            $outs.WriteLine( $ins.ReadLine() )
        }
    }
    finally {
        $outs.Close()
        $ins.Close()
    }
}).TotalSeconds

The first change brought the processing time for my 60 MB file down from 5.3 s to 4 s. The rest of the changes are more cosmetic.

Oliver answered 2020-07-08T22:29:37Z
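The streaming approach above writes to a second file and leaves the original untouched. One way to get an in-place result is to wrap it in a function that moves the output over the original afterwards. This is only a sketch under stated assumptions: the function name `Remove-TopLinesStreaming`, its parameters, and the `.tmp` suffix are hypothetical and not part of either answer. Note that StreamWriter writes UTF-8 without a byte-order mark by default, so files in other encodings would need an explicit encoding argument.

```powershell
# Hypothetical helper: streams $Path to a temp file without the first $Skip
# lines, then moves the temp file over the original.
function Remove-TopLinesStreaming {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $Skip = 1
    )
    # .NET resolves relative paths against the process working directory,
    # not the current PowerShell location, so resolve to a full path first.
    $full = (Resolve-Path -LiteralPath $Path).ProviderPath
    $temp = "$full.tmp"   # assumed temp-file name; overwritten if it exists
    $ins = New-Object System.IO.StreamReader $full
    try {
        $outs = New-Object System.IO.StreamWriter $temp
        try {
            # Discard the first $Skip lines, allowing for shorter files
            for ($s = 0; $s -lt $Skip -and !$ins.EndOfStream; $s++) {
                $null = $ins.ReadLine()
            }
            # Copy the remaining lines one at a time
            while (!$ins.EndOfStream) {
                $outs.WriteLine($ins.ReadLine())
            }
        }
        finally { $outs.Close() }
    }
    finally { $ins.Close() }
    Move-Item -LiteralPath $temp -Destination $full -Force
}
```

Only one line is held in memory at a time, which is the property that made the StreamReader/StreamWriter answers practical for very large files.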

3 votes

I just learned this from a website:

Get-ChildItem *.txt | ForEach-Object { (get-Content $_) | Where-Object {(1) -notcontains $_.ReadCount } | Set-Content -path $_ }

Or you can use aliases to make it shorter, like this:

gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }

Luke Du answered 2020-07-08T22:30:01Z

2 votes

$x = get-content $file

$x[1..$x.count] | set-content $file

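That's it. A long, boring explanation follows. Get-Content returns an array. We can "index into" array variables, as demonstrated in this and other Scripting Guy posts.

For example, if we define an array variable like this,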

$array = @("first item","second item","third item")

so that $array returns

first item

second item

third item

then we can "index into" that array to retrieve only its first element

$array[0]

or only its second

$array[1]

or a range of index values from the second through the last.

$array[1..$array.count]

noam answered 2020-07-08T22:30:43Z

1 votes

-Skip didn't work for me, so my workaround is:

$LinesCount = $(get-content $file).Count

get-content $file |
    select -Last $($LinesCount-1) |
    set-content "$file-temp"

move "$file-temp" $file -Force

Mariusz Biesiekierski answered 2020-07-08T22:31:03Z

0 votes

Another way to remove the first line from a file is to use the multiple-assignment technique. See the reference link.

$firstLine, $restOfDocument = Get-Content -Path $filename

$modifiedContent = $restOfDocument

$modifiedContent | Out-String | Set-Content $filename

Venkataraman R answered 2020-07-08T22:31:23Z
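To see what the multiple assignment does, it can help to try it on a literal array rather than a file. The values below are made up for illustration; the variable names mirror the answer above.

```powershell
# Illustrative data only: stands in for the lines Get-Content would return.
$lines = 'header', 'row 1', 'row 2'

# The first variable receives the first element;
# the last variable receives everything that is left.
$firstLine, $restOfDocument = $lines

# With a single-element source, the second variable ends up $null.
$onlyLine, $nothingLeft = @('header')
```

Here `$firstLine` holds `header` and `$restOfDocument` holds the two remaining rows, which is why writing `$restOfDocument` back to the file drops the first line.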

-1 votes

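For smaller files, you could use this:

& C:\windows\system32\more +1 oldfile.csv > newfile.csv | Out-Null

...but it is not very effective at processing my 16 MB example file. It doesn't seem to terminate and release the lock on newfile.csv.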

danielmbarlow answered 2020-07-08T22:31:52Z
