Getting a Recursive FTP File List in .Net


Even though it’s 2009, there are still some dark areas of the internet that haven’t been upgraded to modern standards.  FTP is one of them.

FTP is closer to HTTP than you think – results of FTP commands are sent back as plain text.  There is no field delimiter, no standard field order, and not even a standard of what data gets returned.  FTP was written with the idea that the user is on a text console, and would be reading the messages from the server directly – clients shouldn’t parse results.  Still, FTP is in wide use and available on everything from servers to cell phone microchips because it does a good job at moving files.

The problem: You need to get a list of all files on a server using FTP.

The issue: FTP doesn’t provide a built-in way to get a recursive list of all files.  It provides only two basic commands for listing the current directory.  LIST (WebRequestMethods.Ftp.ListDirectoryDetails) returns the file names with details, in a format that depends on the server’s configuration.  NLST (WebRequestMethods.Ftp.ListDirectory) returns a “name list”: the same entries as LIST, but only the names and no details.
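For reference, the command names .NET sends for these two constants are LIST and NLST.  Here is a minimal sketch of building either kind of listing request; the helper name CreateListing and its parameters are mine, not part of the framework, and nothing touches the network until GetResponse() is called on the result.

```csharp
using System.Net;

public static class FtpListingRequests
{
    // Hypothetical helper: builds (but does not send) a listing request.
    // 'method' is one of the WebRequestMethods.Ftp constants:
    //   ListDirectory        -> "NLST" (names only)
    //   ListDirectoryDetails -> "LIST" (names plus server-formatted details)
    public static FtpWebRequest CreateListing(string folderUri, string method,
                                              string user, string pass)
    {
        FtpWebRequest request = (FtpWebRequest)WebRequest.Create(folderUri);
        request.Credentials = new NetworkCredential(user, pass);
        request.Method = method;
        return request;
    }
}
```

Calling it once with each constant for the same folder gives the two requests a directory walk needs.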

The result of LIST might look like:

09-18-08  02:11PM             18918524 readme.txt
09-18-08  02:13PM             18918676 Try 2 Parse Me!
05-04-09  02:16PM       <DIR>          I’m a folder

… or it might look like this:

-rwxrwxrwx   1 owner    group        18918524 Sep 18  2008 readme.txt
-rwxrwxrwx   1 owner    group        18918676 Sep 18  2008 Try 2 Parse Me!
drwxrwxrwx   1 owner    group               0 May  4 14:16 I’m a folder

… or something else entirely.  That’s the F in FTP – Fun!  (Or it could mean F**k’d).

The solution:  First, I’m vetoing the use of regular expressions.  Experience has taught me there be dragons in that namespace, and anytime you can avoid a regex, do.  Second, avoid recursive functions unless you’re in a functional programming language.  Here we go:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;

// Place this method inside a class.  FtpUri must end with "/" because
// file and folder names are appended to it directly.
public static String[] FTPListTree(String FtpUri, String User, String Pass) {

    List<String> files = new List<String>();
    Queue<String> folders = new Queue<String>();   // folders still to visit
    folders.Enqueue(FtpUri);

    while (folders.Count > 0) {
        String fld = folders.Dequeue();
        List<String> newFiles = new List<String>();

        // Pass 1: NLST returns bare names, files and folders alike.
        FtpWebRequest ftp = (FtpWebRequest)FtpWebRequest.Create(fld);
        ftp.Credentials = new NetworkCredential(User, Pass);
        ftp.UsePassive = false;
        ftp.Method = WebRequestMethods.Ftp.ListDirectory;
        using (StreamReader resp = new StreamReader(ftp.GetResponse().GetResponseStream())) {
            String line = resp.ReadLine();
            while (line != null) {
                newFiles.Add(line.Trim());
                line = resp.ReadLine();
            }
        }

        // Pass 2: LIST returns the details; they are only used to spot folders.
        ftp = (FtpWebRequest)FtpWebRequest.Create(fld);
        ftp.Credentials = new NetworkCredential(User, Pass);
        ftp.UsePassive = false;
        ftp.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
        using (StreamReader resp = new StreamReader(ftp.GetResponse().GetResponseStream())) {
            String line = resp.ReadLine();
            while (line != null) {
                if (line.Trim().ToLower().StartsWith("d") || line.Contains(" <DIR> ")) {
                    // Match the line to a known name by its ending.  The longest
                    // match wins, so "a" is not picked for a line ending in "data".
                    String dir = newFiles.Where(x => line.EndsWith(x))
                                         .OrderByDescending(x => x.Length)
                                         .First();
                    newFiles.Remove(dir);
                    folders.Enqueue(fld + dir + "/");
                }
                line = resp.ReadLine();
            }
        }

        // Whatever was not identified as a folder is a file.
        files.AddRange(from f in newFiles select fld + f);
    }
    return files.ToArray();
}

This function uses a two-step process to parse a directory.  First a list of file and directory names is retrieved, then a second call is made to get the details of the files.  Yes, there are two calls to the server per directory: this allows a safe way to determine the directory name without heavy parsing of the details string.  The use of a Queue avoids the need for recursion.
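The matching step can be tried out without a server.  Below is a minimal sketch that pulls the second pass into a helper and feeds it canned listings; the names ListingSplitter and ExtractDirectories are hypothetical, and it only recognizes the two listing formats shown earlier.  Unlike the function above, it skips a folder line it cannot match instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListingSplitter
{
    // Hypothetical helper: given the NLST names and the LIST detail lines for
    // one folder, removes the folder names from 'names' and returns them.
    // Whatever is left in 'names' afterwards is treated as a file.
    public static List<string> ExtractDirectories(List<string> names,
                                                  IEnumerable<string> detailLines)
    {
        var dirs = new List<string>();
        foreach (string line in detailLines)
        {
            string trimmed = line.Trim();
            bool isDir = trimmed.StartsWith("d", StringComparison.Ordinal)
                         || trimmed.Contains(" <DIR> ");
            if (!isDir) continue;

            // Match by line ending; prefer the longest name that fits.
            string dir = names
                .Where(n => trimmed.EndsWith(" " + n, StringComparison.Ordinal))
                .OrderByDescending(n => n.Length)
                .FirstOrDefault();
            if (dir == null) continue;   // no known name fits; skip rather than guess

            names.Remove(dir);
            dirs.Add(dir);
        }
        return dirs;
    }
}
```

Passing in the three names and the Unix-style sample lines from earlier should leave the two files in the name list and return the one folder.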

Notes:  This function doesn’t perform error checking and will throw an exception on any error – in my case this is the desired behavior, but YMMV.  Also, this method isn’t designed for speed – it’s fast enough for my solution (syncing folders across FTP with some custom logic tossed in), so I’m sure there is some room for improvement.
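If failing fast isn’t what you want, one option is to catch WebException per folder and keep going.  This is a rough sketch under assumptions: ListOrSkip and its delegate parameter are hypothetical, and it presumes the walk has been restructured so that listing a single folder is its own call.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class TolerantListing
{
    // Hypothetical wrapper: runs a per-folder listing delegate and records a
    // failure message instead of letting the exception abort the whole walk.
    public static string[] ListOrSkip(string folder,
                                      Func<string, string[]> listFolder,
                                      List<string> failures)
    {
        try
        {
            return listFolder(folder);
        }
        catch (WebException ex)
        {
            // For FTP errors the response, when present, carries the server's reply text.
            FtpWebResponse response = ex.Response as FtpWebResponse;
            string reason = response != null ? response.StatusDescription : ex.Message;
            failures.Add(folder + ": " + (reason ?? "").Trim());
            return new string[0];
        }
    }
}
```

Anything other than a WebException still propagates, which keeps genuine bugs visible.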

I posted this because I didn’t find anything in the .Net framework that did this already, and when searching I found an overwhelming number of samples using regular expressions.  Regular expressions are tricky to get right, hard to read, a pain to test, and in my view are a weapon of last resort, for when a degree of false positives is acceptable.


Posted 05-04-2009 4:51 PM by Michael C. Neel