Showing posts with label ftp. Show all posts
Showing posts with label ftp. Show all posts

Tuesday, March 20, 2012

Check FTP file date

First I want to thank everyone that has given help to me and everyone else with the issues involving migrating to 2005... Thanks alot..

Now for the problem. I am looking for (an not finding anything of help) to check the date of a file on an ftp server. A file always exists but once a month the day changes. I would just download the file and check it locally but the files are several hundred megs in size so that would be inefficient.

So is there anyway to do that?

On another note, can anyone point me to a good resource for learning the scripting language that SSIS uses?

This trhread sounds similar to your problem. See if it helps you.

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=993578&SiteID=1

|||

This is shooting straight from the hip, so I don't know how close this is for you:

The easiest way to get a file's timestamp is through the System.IO.FileInfo class. I can easily read lots of valuable information from files on our LAN using code like this:

Dim fi As System.IO.FileInfo
Dim CreationTime As Date

fi = New System.IO.FileInfo(<Path_to_file>)
CreationTime = fi.LastWriteTime

The .NET base classes are quite insensitive to "paths", so I can use UNC paths (like \\MyServer\MyShare\MyFolder\MyFile.txt) just as easily as referencing paths on my local machine ("C:\MyFolder\MyFile.txt").

FTP sites are usually protected by passwords and things, and your credentials are passed to the FTP server for processing before you can get at the files. If your files happen to be on your company's LAN, you may be able to use the FIleInfo class to get the info you need. If it's on someone else's network, or only accessible across the Internet, you'll probably need something a bit more heavy-duty.

It looks as though the SSIS FTP task is solely intended to grab files and bring 'em down to a network. I don't see any ability to pass an FTP command (like "LS" or "CWD") to an FTP server and read back a response. Maybe in a future version of SSIS, eh?

|||

the ftp connection (which does all the work for the FtpTask) is written in c++ using winInet call.

the managed System.Net.FtpWebRequest class could be used in a script task to send make a call to get the file info from the server, you would need to grab the connection info from the ftp connection manager and make your own connection though.

I've never use the class, perhaps someone out there has?

|||

Here is some code I found at: http://www.devasp.net/net/articles/display/246.html

The Code works to access an FTP site and write the detail file information (date, size, file/directory name) into a StreamReader (unfortunately, I am not knowledgeable enough to figure out how to get the info out of the StreamReader and do something with it!) I created the message box so I could see what the StreamReader returned...

It gets the information the OP wanted (date of file) from an FTP server. Some of the other guru's can probably assist with how to use the resulting stream (or tell us a better way to store the output from the ListDirectoryDetails method, beyond my capabilities right now)...

' Had to set Option Strict Off to allow:
' fwr = FtpWebRequest.Create(ftp://xxx.xxx.x.x)
' I don' t know how to change this statement to allow it to work with Option Strict On

Option Strict Off

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
' Must have System.Net to use FtpWebRequest
Imports System.Net
' Must have System.IO to use StreamReader
Imports System.IO

Public Class ScriptMain
Public Sub Main()

Dim fwr As FtpWebRequest
fwr = FtpWebRequest.Create(ftp://xxx.xxx.x.x) ' or ftp://ftp.somewhere.com
fwr.Credentials = New NetworkCredential("userid", "password")
fwr.Method = WebRequestMethods.Ftp.ListDirectoryDetails

Dim sr As New StreamReader(fwr.GetResponse().GetResponseStream())
Dim str As String = sr.ReadLine()
While Not str Is Nothing
'Console.WriteLine(str)
MsgBox(str)
str = sr.ReadLine()
End While

sr.Close()
sr = Nothing
fwr = Nothing

Dts.TaskResult = Dts.Results.Success

End Sub
End Class

|||

Change the type or cast it to allow Option Strict On, which I highly recommend.

Dim fwr As WebRequest

fwr = FtpWebRequest.Create("ftp://xxx.xxx.x.x") ' or

Dim ftp As FtpWebRequest

ftp = CType(FtpWebRequest.Create("ftp://xxx.xxx.x.x"), FtpWebRequest)

Sunday, March 11, 2012

Check data before insert

I need help thinking about this problem :-)

I have an SSIS pkg that automatically downloads financial extracts from an ftp site. Once the files are downloaded, I load the extract to a table. The table is first deleted before the insert, so that each time the table has "fresh" data (whatever was in the extract for that day).

Once the extract data is in the table, I load the data into yet another table that combines data from many tables. Simple enough.

Problem is, sometimes when I download the extract, it hasn't been updated yet, so I'm downloading an OLD extract. This old data then gets loaded into the first table. That's ok, because it doesn't really hurt anything. I can always delete the table and reload it if necessary.

The problem occurs when the old data goes from this table into the OTHER table. We don't want old data in this other table!

I need a way to check that I'm not loading the same data 2 days in a row into the OTHER table.

I realize that I might be able to solve this problem without using SSIS, but the solutions I've come up with so far aren't 100% satisfactory. I can use a query to check dates and that sort of thing, but it isn't foolproof, and would create problems if I need to manually force the process though, that is, if I need to override the date logic.

Anyways, I'm wondering if there's an SSIS approach to this problem... I can't rely on timestamps on the data files either. They're not accurate.

This is has been very perplexing to me.

Thanks

You have to determine a way to programatically decide if the data is "new" or not. Without that, I don't know how we can solve anything for you.|||

Yes, I have that.

Once I load the table, I check a "date entered" field against the data's timestamp column.

That is, if I load the data on 8/15, the data timestamp should be 8/14, or the day before.

If it's 8/13, it's out of date.

But what happens if the data load on 8/14 (with a data timestamp of 8/13) didn't run... and I need to re-load the data from 8/14 on 8/15?

Now, date entered is 8/15, but the data has a data timestamp of 8/13, which in this case IS correct, but my logic will reject the data because it isn't from the day before.

That's the problem, as I see it. If I hardcode logic into my stored proc, then I can't easily override it if I need to.

I was thinking maybe, instead, I could put the date logic in a separate SQL function, then call this function from an Execute SQL task in SSIS. Then have the function return a 0 (failure) or 1 (success). If it returns a 0, then do not perform any more steps in the package. If it returns a 1, then finish running the steps in the package.

Is this possible to do? If so, that could solve the problem.

Thanks

|||My idea being, if I need to override the logic, I can simply disable the Execute SQL task that contains the function. Then run the package manually.|||You could store the last successful date loaded and then read that in at the beginning of your package and then compare it to your data coming in. If the data coming in is greater than that date, then process it.|||

I would suggest the above mention of saving the successful date. Alternately, if you have a decent audit trail, you can use the date and time of the last run and do a comparison of the files timestamp

'

' Add your code here

'

Dim vars As Variables

Dim var As Variable

Dts.VariableDispenser.LockForRead("FileName")

Dts.VariableDispenser.GetVariables(vars)

Dim path As String = vars.Item(0).Value.ToString

If File.Exists(path) = False Then

' File does not exist, no attributes to read

Else

vars = Nothing

Dts.VariableDispenser.LockForWrite("FileModifiedDate")

Dts.VariableDispenser.LockForWrite("FileCreationDate")

Dts.VariableDispenser.GetVariables(vars)

vars("FileModifiedDate").Value = File.GetLastWriteTime(path)

vars("FileCreationDate").Value = File.GetCreationTime(path)

vars.Unlock()

End If

Dts.TaskResult = Dts.Results.Success

|||

I'm not sure if I've made my problem very clear :-)

As I mentioned in a previous post, I already have a way to determine if data is out of date or not on any given day.

The problem is that if I want to be able to override this logic if I need to. Please see my previous post that clearly explains what the problem is.

Q.) I am wondering if I can call a function in an Execute SQL task to have it return a value to the package. Depending on the value returned, I want to halt execution of the package.

Thanks

|||

Never mind... I realize my logic won't work anyways.

I just need to check the timestamps in the file names.........