1. 12 Jan, 2017 1 commit
  2. 05 Jan, 2017 1 commit
  3. 22 Nov, 2016 1 commit
  4. 30 Sep, 2016 9 commits
  5. 18 Aug, 2016 1 commit
  6. 23 Jun, 2016 1 commit
  7. 22 Jun, 2016 1 commit
  8. 21 Jun, 2016 1 commit
  9. 17 Jun, 2016 1 commit
  10. 04 Jun, 2016 1 commit
  11. 03 Jun, 2016 6 commits
  12. 23 May, 2016 3 commits
    • 575a06c4
    • Support multihosted/threaded/ranged S3 operations · fe6c7bc9
      Earle F. Philhower, III authored
      Major rewrite to turn the plugin into a multithreaded, high-performance
      consumer of S3 services and features from Amazon and third-party
      object storage.
      
      Enable multithreaded, multipart/multirange operation on S3 objects:
        Enables support of parallel upload and download of portions of large
        objects, which can significantly improve performance on both Amazon S3
        and other object stores.
      
        The number of parallel threads used can be configured with the parameter
          S3_MPU_THREADS=##  (default 10 threads)
      
        The size in MB of each part of the object is configured with
          S3_MPU_CHUNK=##  (default 64, units in MB)
      
        For storing objects, multipart PUT is used.  For retrieving objects
        from the archive, multi-ranged HTTP GETs are performed.
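  As a rough illustration of the part split described above (a sketch only,
  with hypothetical names, not code from the plugin), each object is divided
  into fixed-size parts with a possibly smaller final part:

```python
def part_ranges(object_size, chunk_mb=64):
    """Split an object into (offset, length) pairs, one per part.

    Mirrors the S3_MPU_CHUNK idea described above: every part is
    chunk_mb megabytes except possibly the final one.
    """
    chunk = chunk_mb * 1024 * 1024
    ranges = []
    offset = 0
    while offset < object_size:
        length = min(chunk, object_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges

# A 130 MB object with 64 MB parts yields three parts:
# [(0, 67108864), (67108864, 67108864), (134217728, 2097152)]
```

  Each (offset, length) pair maps naturally to one upload part or one HTTP
  Range request, which is what allows the parts to be handled in parallel.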
      
        Implements a random backoff, from 1s up to S3_WAIT_TIME, for each
        retry to avoid a dogpile after network hiccups (which are often short
        and transient, and would otherwise cause many threads to resubmit at
        the same time).
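  The retry-with-random-backoff behavior might be sketched like this
  (illustrative only; retry_count and wait_time_sec stand in for the
  S3_RETRY_COUNT and S3_WAIT_TIME_SEC parameters, and are not plugin code):

```python
import random
import time

def with_retries(op, retry_count=15, wait_time_sec=5.0):
    """Run op(), retrying on transient errors with a random backoff.

    Each retry sleeps a random interval between 1s and wait_time_sec so
    that many threads hitting the same network hiccup do not all
    resubmit at the same instant.
    """
    for attempt in range(retry_count):
        try:
            return op()
        except IOError:
            if attempt == retry_count - 1:
                raise  # retries exhausted; surface the error
            time.sleep(random.uniform(1.0, wait_time_sec))
```

  The jitter is the important part: a fixed delay would simply move the
  thundering herd to a later instant rather than spreading it out.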
      
      Add S3_DEFAULT_HOSTNAME multiple host endpoint support:
        Use a comma between host:port entries to specify multiple hosts/ports
        to connect to in sequence (and in parallel when doing multipart
        uploads).  These hosts are iterated over when connecting to the S3
        service.
          S3_DEFAULT_HOSTNAME=ip1:port1,ip2:port2,ip3:port3;S3_AUTH_FILE=...
      
        Errors are logged to include the S3 host being used at the time of
        failed operations, to help track down connectivity or other issues.
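  The host rotation can be pictured as a simple round-robin over the
  comma-separated S3_DEFAULT_HOSTNAME list (a sketch under that assumption;
  the plugin's actual selection logic may differ):

```python
import itertools

def host_cycle(default_hostname):
    """Yield host:port endpoints in sequence, wrapping around.

    default_hostname is a comma-separated S3_DEFAULT_HOSTNAME value,
    e.g. "ip1:port1,ip2:port2,ip3:port3".
    """
    hosts = [h.strip() for h in default_hostname.split(",") if h.strip()]
    return itertools.cycle(hosts)

endpoints = host_cycle("10.0.0.1:443,10.0.0.2:443")
# Each next(endpoints) call returns the next endpoint, wrapping around,
# so retries and parallel part uploads spread across all hosts.
```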
      
      Add S3_SERVER_ENCRYPT=[0|1] to enable at-rest S3 encryption:
        This sets the flag to request that S3 store the object data in encrypted
        form on-disk.  This is NOT the same thing as HTTPS, and only affects the
        data at-rest.  If you need encrypted communications to S3 as well as
        encryption of the object data once there, please be sure to specify
          ...;S3_PROTO=HTTPS;S3_SERVER_ENCRYPT=1;...
        on the resource definition line.
      
        Note also that this is not supported by some local S3 appliances or
        software.  When unsupported, the S3 server will return errors for all
        upload operations, logged in rodsLog.
      
      Add S3_ENABLE_MD5 option to enable upload checksums on all PUTs:
        Performs an MD5 checksum calculation for every S3 PUT command and
        sets the Content-MD5 header accordingly.
      
        Note that this requires reading each file effectively twice: the file
        is read once to calculate the MD5 (the digest must be known before
        the headers are sent, so it cannot be computed on the fly), and a
        second time to actually send the file over the network to S3.
      
        If there is an MD5 mismatch, it will be logged as an error by S3 and
        in rodsLog, and the iRODS system will be informed of the archive
        failure.
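  For reference, the Content-MD5 header value S3 expects is the
  base64-encoded binary MD5 digest of the payload, which is why a full first
  pass over the file is needed before the upload.  A minimal sketch (the
  function name is hypothetical):

```python
import base64
import hashlib

def content_md5(path, chunk_size=1 << 20):
    """First pass over the file: compute the Content-MD5 header value.

    S3 expects the base64-encoded binary MD5 digest (not the hex form).
    The file is then read a second time for the actual upload.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            md5.update(block)
    return base64.b64encode(md5.digest()).decode("ascii")
```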
      
      Testing parameters to support out-of-CI testing:
        Add S3BUCKET and S3PARAMS environment overrides for resource test.
        By default, the hardcoded bucket name and parameter values
        (authentication, threads, proto, etc.) from the prior version are
        used.  If the environment variable S3BUCKET is set, it overrides the
        bucket used, and if S3PARAMS is set, it overrides the entire resource
        configuration string.
        For example:
         S3BUCKET=user \
         S3PARAMS='S3_DEFAULT_HOSTNAME=192.168.122.128:443;S3_AUTH_FILE=/tmp/s3;S3_PROTO=HTTP;S3_RETRY_COUNT=15;S3_WAIT_TIME_SEC=1;S3_MPU_THREADS=10;S3_MPU_CHUNK=16' \
         python run_tests.py --run_specific_test test_irods_resource_plugin_s3
      
      Add ERROR_INJECT testmode:
        Adds a -DERROR_INJECT compile-time option which will cause the specified
        call to pread or pwrite in a callback to fail.  This allows testing the
        retry/recovery mechanisms.
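  The same fault-injection idea can be mimicked in a test harness outside
  the plugin (purely illustrative; the plugin's actual mechanism is the
  -DERROR_INJECT compile-time flag in the C++ source, not this code):

```python
def inject_failure(fn, fail_on_call):
    """Wrap fn so that exactly one chosen invocation raises IOError.

    Analogous to the -DERROR_INJECT idea: force a single pread/pwrite
    style callback to fail so retry/recovery paths get exercised.
    """
    state = {"calls": 0}
    def wrapper(*args, **kwargs):
        state["calls"] += 1
        if state["calls"] == fail_on_call:
            raise IOError("injected failure on call %d" % fail_on_call)
        return fn(*args, **kwargs)
    return wrapper
```

  Wrapping an I/O callback this way lets a test confirm that a single
  transient failure is absorbed by the retry logic rather than aborting the
  whole transfer.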
      
      Remove dead/obsolete code, fix GCC -pedantic -Wall warnings:
        Clean up the code, removing most global variables, dead variables and
        functions, and fixing GCC warnings in -pedantic mode.
      
      Rework of error/retry handling, obey requested number of retries for all ops:
        All operations now retry on transient S3 errors, honoring the
        specified delay time and maximum number of retries.
      fe6c7bc9
    • Support multihosted/threaded/ranged S3 operations · efeae974
      Earle F. Philhower, III authored
      (same commit message as fe6c7bc9 above)
      efeae974
  13. 09 Mar, 2016 2 commits
  14. 15 Feb, 2016 1 commit
  15. 03 Feb, 2016 1 commit
  16. 28 Jan, 2016 1 commit
  17. 05 Nov, 2015 1 commit
  18. 25 Sep, 2015 1 commit
  19. 23 Sep, 2015 3 commits
  20. 27 Jul, 2015 1 commit
  21. 14 Jul, 2015 1 commit
  22. 29 Apr, 2015 1 commit