[WEB4LIB] Huge file delivery
Alnisa Allgood
alnisa at nonprofit-tech.org
Fri Apr 29 13:52:00 EDT 2005
At 1:39 PM -0700 4/28/05, A. Bullen wrote:
>All--
>
>Forgive a naive question, but I have never had to deal with the
>following situation and I don't know how to pull it off. We will be in
>receipt of a very large GPS data set consisting of files that total one
>terabyte altogether; I think the individual data sets are 20-30 GB
>apiece.
>
>Does anyone have a suggestion how I can successfully distribute files
>this large on an on-demand basis? I can put them on servers that share a
>T-3, but I am not sure FTP can handle this size and scope of file
>transmission.
Just a question: will the users need access to the entire data set, or
would it be reasonable to set up an interface that allows them to pull
data from the data set based on a query? Also, I'm assuming that the
data doesn't have any HIPAA limitations on it.
I ask because you could possibly do dual distribution. Those who
actually need the full data set could download it via FTP, and
hopefully they also have a T3 connection, or they will be downloading
for days. But you could place some basic access information next to the
file links (assuming people will come in through the web) indicating
approximate download times, like 2 hrs on a T3 connection, 10 hrs on a
basic DSL connection, etc.
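The rough arithmetic behind those estimates is easy to script. Here's a
quick sketch in Python; the 25 GB file size and the nominal link speeds
are illustrative figures, and real-world throughput will be lower:

# Rough download-time estimates for one large file.
FILE_SIZE_GB = 25  # per the 20-30 GB per-data-set estimate above

# Nominal link speeds in megabits per second (illustrative values).
LINKS = {"T3": 45.0, "T1": 1.544, "DSL": 1.5}

file_bits = FILE_SIZE_GB * 1024 ** 3 * 8  # gigabytes -> bits

for name, mbps in sorted(LINKS.items()):
    hours = file_bits / (mbps * 1e6) / 3600
    print("%s: ~%.1f hours" % (name, hours))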
But for those who just need to pull a subset of the data, you could set
up a web interface for the data itself that queries the data set and
either displays records on the web or saves them as a comma-separated
text file for download. Of course, this partially depends on the actual
data format, but I know a number of formats, like SAS and SPSS files,
can be converted to Access for web use.
If most people only need 1,000 records out of a million or a billion,
then something like this might work.
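To make that concrete, here's a minimal sketch of a subset-to-CSV
export, assuming the data has been loaded into a SQL database first.
The database file, table, and column names below ("gpsdata.db",
"readings", "region") are hypothetical, just for illustration:

import csv
import sqlite3

def export_subset(db_path, region, out_path, limit=10000):
    # Pull a capped subset of rows matching the query.
    conn = sqlite3.connect(db_path)
    cur = conn.execute(
        "SELECT * FROM readings WHERE region = ? LIMIT ?",
        (region, limit),
    )
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # header row
        writer.writerows(cur)  # the cursor yields row tuples
    conn.close()

# e.g. export_subset("gpsdata.db", "dane-county", "subset.csv")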
Alnisa