-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
NVIDIA PERFORMANCE PRIMITIVES 
Version 3.2
RELEASE_NOTES.TXT
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

New Features
------------

  * Improved documentation. In addition to the HTML documentation we now provide
    a PDF file with cross-references.

  * This release adds a total of 167 new functions. Most of them are basic
    signal- and image-processing functions, mainly for data initialization,
    data transfer, and arithmetic.
    
    Core NPP:
    - nppGetMaxThreadsPerBlock
    
    NPPI:
    - nppiMalloc_32f_C2
    - nppiSet_8u_C1MR
    - nppiSet_8u_C4MR
    - nppiSet_8u_AC4MR
    - nppiSet_8u_C4CR
    - nppiSet_16u_C1R
    - nppiSet_16u_C1MR
    - nppiSet_16u_C4R
    - nppiSet_16u_C4MR
    - nppiSet_16u_AC4R
    - nppiSet_16u_AC4MR
    - nppiSet_16u_C4CR
    - nppiSet_16s_C1R
    - nppiSet_16s_C1MR
    - nppiSet_16s_C4R
    - nppiSet_16s_C4MR
    - nppiSet_16s_AC4R
    - nppiSet_16s_AC4MR
    - nppiSet_16s_C4CR
    - nppiSet_32s_C1R
    - nppiSet_32s_C1MR
    - nppiSet_32s_C4R
    - nppiSet_32s_C4MR
    - nppiSet_32s_AC4R
    - nppiSet_32s_AC4MR
    - nppiSet_32s_C4CR
    - nppiSet_32f_C1MR
    - nppiSet_32f_C4R
    - nppiSet_32f_C4MR
    - nppiSet_32f_AC4R
    - nppiSet_32f_AC4MR
    - nppiSet_32f_C4CR
    - nppiCopy_8u_C1R
    - nppiCopy_8u_C4R
    - nppiCopy_8u_AC4R
    - nppiCopy_16u_C1R
    - nppiCopy_16u_C4R
    - nppiCopy_16u_AC4R
    - nppiCopy_16s_C1R
    - nppiCopy_16s_C4R
    - nppiCopy_16s_AC4R
    - nppiCopy_32s_C1R
    - nppiCopy_32s_C4R
    - nppiCopy_32s_AC4R
    - nppiCopy_32f_C1R
    - nppiCopy_32f_C4R
    - nppiCopy_32f_AC4R
    - nppiConvert_8u32f_C1R
    - nppiConvert_32f8u_C1R
    - nppiAddC_32f_C1R
    - nppiSubC_32f_C1R
    - nppiMulC_32f_C1R
    - nppiDivC_32f_C1R
    - nppiAbsDiffC_32f_C1R
    - nppiAddC_32fc_C1R
    - nppiSubC_32fc_C1R
    - nppiMulC_32fc_C1R
    - nppiDivC_32fc_C1R
    - nppiAdd_32s_C1R
    - nppiSub_32s_C1R
    - nppiMul_32s_C1R
    - nppiDiv_32f_C1R
    - nppiDiv_32s_C1R
    - nppiAbsDiff_32s_C1R
    - nppiLn_32f_C1R
    - nppiExp_32f_C1R
    - nppiMagnitude_32fc32f_C1R
    - nppiMagnitudeSqr_32fc32f_C1R
    - nppiEvenLevelsHost_8u_C1
    - nppiHistogramEvenGetBufferSize_8u_C1R
    - nppiHistogramRangeGetBufferSize_8u_C1R
    - nppiHistogramRange_8u_C1R
    - nppiReductionGetBufferHostSize_8u_C1R
    - nppiReductionGetBufferHostSize_8u_C4R
    
    - nppiGraphcut_32s8u
    - nppiGraphcutGetSize
    
    NPPS:
    - nppsMalloc_8u
    - nppsMalloc_16u
    - nppsMalloc_16s
    - nppsMalloc_16sc
    - nppsMalloc_32s
    - nppsMalloc_32u
    - nppsMalloc_32sc
    - nppsMalloc_32f
    - nppsMalloc_32fc
    - nppsMalloc_64s
    - nppsMalloc_64sc
    - nppsMalloc_64f
    - nppsMalloc_64fc
    - nppsFree
        
    - nppsSet_8u
    - nppsSet_16s
    - nppsSet_16sc
    - nppsSet_32s
    - nppsSet_32sc
    - nppsSet_32f
    - nppsSet_32fc
    - nppsSet_64s
    - nppsSet_64sc
    - nppsSet_64f
    - nppsSet_64fc
    
    - nppsZero_8u
    - nppsZero_16s
    - nppsZero_16sc
    - nppsZero_32s
    - nppsZero_32sc
    - nppsZero_32f
    - nppsZero_32fc
    - nppsZero_64s
    - nppsZero_64sc
    - nppsZero_64f
    - nppsZero_64fc
    
    - nppsCopy_32s
    - nppsCopy_32f
    - nppsCopy_8u
    - nppsCopy_16s
    - nppsCopy_16sc
    - nppsCopy_64s
    - nppsCopy_32sc
    - nppsCopy_32fc
    - nppsCopy_64sc
    - nppsCopy_64fc
    
    - nppsReductionGetBufferSize_8u
    - nppsReductionGetBufferSize_16s
    - nppsReductionGetBufferSize_16s_Sfs
    - nppsReductionGetBufferSize_16sc
    - nppsReductionGetBufferSize_16sc_Sfs
    - nppsReductionGetBufferSize_32s
    - nppsReductionGetBufferSize_32s_Sfs
    - nppsReductionGetBufferSize_32sc
    - nppsReductionGetBufferSize_32f
    - nppsReductionGetBufferSize_32fc
    - nppsReductionGetBufferSize_64s
    - nppsReductionGetBufferSize_64f
    - nppsReductionGetBufferSize_64fc
    - nppsSum_32f
    - nppsSum_32fc
    - nppsSum_64f
    - nppsSum_64fc
    - nppsSum_16s_Sfs
    - nppsSum_32s_Sfs
    - nppsSum_16s32s_Sfs
    - nppsSum_16sc_Sfs
    - nppsSum_16sc32sc_Sfs
    - nppsMax_16s
    - nppsMax_32s
    - nppsMax_32f
    - nppsMax_64f
    - nppsMin_16s
    - nppsMin_32s
    - nppsMin_32f
    - nppsMin_64f
    - nppsMinMaxGetBufferSize_8u
    - nppsMinMaxGetBufferSize_16s
    - nppsMinMaxGetBufferSize_16u
    - nppsMinMaxGetBufferSize_32s
    - nppsMinMaxGetBufferSize_32u
    - nppsMinMaxGetBufferSize_32f
    - nppsMinMaxGetBufferSize_64f
    - nppsMinMax_8u
    - nppsMinMax_16s
    - nppsMinMax_16u
    - nppsMinMax_32s
    - nppsMinMax_32u
    - nppsMinMax_32f
    - nppsMinMax_64f
    
  * Primitive renames:
    - nppiCannyGetSize -> nppiCannyGetBufferSize
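
    Existing sources that still call the old name need to be updated. The
    following is only a sketch of one way to do that, not an official
    migration tool: rename_canny is a hypothetical helper that rewrites the
    identifier in text read from standard input. It is a plain text
    substitution, so review the result before committing it.

```shell
# rename_canny: hypothetical helper, not part of NPP.
# Reads source text on stdin and writes it to stdout with every occurrence
# of the old primitive name replaced by the new one. The new name does not
# contain the old one, so running the helper twice is harmless.
rename_canny() {
    sed 's/nppiCannyGetSize/nppiCannyGetBufferSize/g'
}
```

    Usage, with hypothetical file names:
    rename_canny < old_source.c > new_source.c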


-------------------------------------------------------------------------------
Revision History
-------------------------------------------------------------------------------

  09/2010 - Version 3.2
  07/2010 - Version 3.1
  03/2010 - Version 1.1
  12/2009 - Version 1.0
  08/2009 - Version 0.9 -- Initial public Beta


-------------------------------------------------------------------------------
More Information
-------------------------------------------------------------------------------

  For more information and help with NPP, please visit
  http://www.nvidia.com/npp

  Note: To run the SDK sample applications, be sure to add the path
  .../NPP/SDK/common/npp/lib to your LD_LIBRARY_PATH environment variable.
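
  As a minimal sketch of the note above, the library directory can be
  prepended to LD_LIBRARY_PATH in a POSIX shell. NPP_LIB_DIR is a
  hypothetical variable introduced for this example, and its default value
  is a placeholder: replace it with the actual location of
  NPP/SDK/common/npp/lib on your system.

```shell
# NPP_LIB_DIR is a hypothetical variable used only in this example.
# The default below is a placeholder, not a real install location.
NPP_LIB_DIR="${NPP_LIB_DIR:-/path/to/NPP/SDK/common/npp/lib}"

# Prepend the NPP library directory. The ${VAR:+...} expansion adds the
# colon separator only when LD_LIBRARY_PATH already has a value, which
# avoids leaving an empty entry at the end of the search path.
export LD_LIBRARY_PATH="${NPP_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

  The setting applies only to the current shell session and to processes
  started from it.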

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Previous Releases
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
Version 3.1
-------------------------------------------------------------------------------

New Features
------------
  * None. 
    - This is purely an update compiled against the CUDA 3.1 Toolkit.
    - All restrictions from the previous release still apply.
    - The one notable change is that from now on NPP is versioned in 
      lock step with the CUDA Toolkit releases. That is the reason
      for the version bump from 1.1 directly to 3.1.
    - The names of the NPP dynamic libraries now carry the same full version
      suffix as their CUDA Toolkit counterparts.


-------------------------------------------------------------------------------
Version 1.1
-------------------------------------------------------------------------------

New Features
------------
  * Hardware Support
    - Added native support for Compute Capability 2.0 hardware (GF100)
    - GF100 class hardware supported (GeForce GTX 470 and GTX 480)

  * Software Support
    - CUDA 3.0
    - Microsoft Visual Studio 9.0 (2008)
    - NPP libraries are now DLLs on Windows
    - Separate NPP libraries for Debug Emulation mode

  * Improved documentation


Last Minute Changes
-------------------
  * Disabled Primitives:
    - YCbCrToRGB_8u_P3R
    - YCbCr422ToYCbCr420_8u_P3R
    - YCbCr422ToYCbCr411_8u_P3R
    - YCbCr420ToYCbCr422_8u_P3R
    - YCbCr420ToYCbCr411_P3P2R
    - ColorTwist32f_8u_P3R


