Encapsulates CUDA API event objects and functions.

#include <event.hpp>

Public Member Functions
	event ()
	Default constructor.

	event (unsigned flags)
	Constructs an event with the given flags.

	~event ()
	Destructor.

void	record (cudaStream_t stream=0)
	Records an event.

cudaError_t	synchronize ()
	Wait until the completion of all device work preceding the most recent call to record().

cudaError_t	query ()
	Query the status of all device work preceding the most recent call to record().

float	operator- (event &other)
	Computes the elapsed time between another event and this event.

Static Public Member Functions
static float	elapsed_time (event &start, event &end)
	Computes the elapsed time between two events (in milliseconds with a resolution of around 0.5 microseconds).

Detailed Description

Encapsulates CUDA API event objects and functions.

CUDA events are useful for assessing the running time of device operations. This can be handy when optimizing kernel implementations or thread configurations.

This is just a thin wrapper around the appropriate cudaEventXXXXX functions in the CUDA API to provide access to the event functions in a more C++-like style. The documentation is shamelessly lifted from the official CUDA documentation (http://docs.nvidia.com/cuda/index.html).

For example, to get the running time of a kernel function:

ecuda::event start, stop;
// ... specify thread grid/blocks
start.record();
kernelFunction<<<grid,block>>>( ... ); // call the kernel
stop.record();
stop.synchronize(); // wait until kernel finishes executing
std::cout << "EXECUTION TIME: " << ( stop - start ) << "ms" << std::endl;

Definition at line 74 of file event.hpp.

Constructor & Destructor Documentation

ecuda::event::event ( )

inline

Default constructor.

Creates a default event object.

Definition at line 85 of file event.hpp.

ecuda::event::event ( unsigned flags )

inline

Constructs an event with the given flags.

As of now, valid flags specified by the CUDA API are:

cudaEventDefault: Default event creation flag.
cudaEventBlockingSync: Specifies that the event should use blocking synchronization. A host thread that uses synchronize() to wait on an event created with this flag will block until the event actually completes.
cudaEventDisableTiming: Specifies that the created event does not need to record timing data. Events created with this flag specified and the cudaEventBlockingSync flag not specified will provide the best performance when used with cudaStreamWaitEvent() and query().

Parameters

flags A valid event flag.

Definition at line 100 of file event.hpp.
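
As a rough illustration of the flags described above, the sketch below creates an event that blocks the host thread while waiting and records no timing data. It assumes the wrapper forwards flags unchanged to the CUDA runtime, which accepts flags combined with bitwise OR. The include path <ecuda/event.hpp>, the fill kernel, and the fill_and_wait helper are hypothetical names chosen for this example.

```cuda
#include <cuda_runtime.h>
#include <ecuda/event.hpp>  // assumed include path (the synopsis shows <event.hpp>)

// Hypothetical kernel used only for illustration: sets every element to value.
__global__ void fill(float* data, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Launches fill() and waits for it to finish.
// devData must point to at least n floats of device memory, with n > 0.
cudaError_t fill_and_wait(float* devData, int n)
{
    // cudaEventBlockingSync: synchronize() blocks the host thread rather than busy-waiting.
    // cudaEventDisableTiming: this event is never used for elapsed-time measurements.
    ecuda::event done(cudaEventBlockingSync | cudaEventDisableTiming);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    fill<<<blocks, threads>>>(devData, n, 1.0f);

    done.record();             // recorded in the default stream, after fill()
    return done.synchronize(); // returns once the work preceding record() is done
}
```

Because the event was created with cudaEventDisableTiming, it should not be passed to elapsed_time() or operator-.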

ecuda::event::~event ( )

inline

Destructor.

Deallocates the underlying CUDA event and destroys this object.

Definition at line 107 of file event.hpp.

Member Function Documentation

static float ecuda::event::elapsed_time ( event & start, event & end )

inline static

Computes the elapsed time between two events (in milliseconds with a resolution of around 0.5 microseconds).

If either event was last recorded in a non-NULL stream, the resulting time may be greater than expected (even if both used the same stream handle). This happens because the record() operation takes place asynchronously and there is no guarantee that the measured latency is actually just between the two events. Operations from other streams may execute between the two measured events and significantly inflate the measured time.

If record() has not been called on either event, the underlying CUDA call (cudaEventElapsedTime) reports cudaErrorInvalidResourceHandle. If record() has been called on both events but one or both of them have not yet completed (that is, query() would return cudaErrorNotReady on at least one of the events), it reports cudaErrorNotReady. If either event was created with the cudaEventDisableTiming flag, it reports cudaErrorInvalidResourceHandle.

Parameters

start	Starting event.
end	Ending event.

Returns: The elapsed time between the two events in milliseconds.

Definition at line 173 of file event.hpp.
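
The following sketch shows one way elapsed_time() might be used to average the cost of several launches. Everything runs in the default (NULL) stream, and the stop event is synchronized before the time is read so that both events have completed. The scale kernel, the mean_launch_ms helper, and the include path are assumptions made for this example.

```cuda
#include <cuda_runtime.h>
#include <ecuda/event.hpp>  // assumed include path (the synopsis shows <event.hpp>)

// Hypothetical kernel used only for illustration: multiplies every element by factor.
__global__ void scale(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Returns the mean time per launch in milliseconds, or a negative value
// if waiting on the stop event fails. Requires n > 0 and repeats > 0.
float mean_launch_ms(float* devData, int n, int repeats)
{
    ecuda::event start, stop;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    start.record();  // default (NULL) stream
    for (int r = 0; r < repeats; ++r)
        scale<<<blocks, threads>>>(devData, n, 1.0001f);
    stop.record();

    // stop was recorded after start in the same stream, so once stop has
    // completed, start has completed as well.
    if (stop.synchronize() != cudaSuccess) return -1.0f;

    return ecuda::event::elapsed_time(start, stop) / static_cast<float>(repeats);
}
```

Averaging over several launches is a design choice of this sketch: it reduces the influence of launch overhead and timer resolution on very short kernels.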

float ecuda::event::operator- ( event & other )

inline

Computes the elapsed time between another event and this event.

This is equivalent to elapsed_time( other, *this ).

Parameters

other The other event with which to compute the elapsed time.

Returns: The elapsed time between the two events in milliseconds.

Definition at line 187 of file event.hpp.
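
Since the operator takes the earlier event on the right-hand side, timing consecutive phases reads as "later minus earlier". A minimal sketch with three events follows; the two kernels, the phase_times struct, the time_two_phases helper, and the include path are hypothetical names used only here.

```cuda
#include <cuda_runtime.h>
#include <ecuda/event.hpp>  // assumed include path (the synopsis shows <event.hpp>)

// Hypothetical kernels used only for illustration.
__global__ void add_one(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

__global__ void square(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= data[i];
}

struct phase_times { float first_ms; float second_ms; };

// Times two back-to-back kernels in the default stream.
// Returns false if waiting on the final event fails. Requires n > 0.
bool time_two_phases(float* devData, int n, phase_times& out)
{
    ecuda::event start, middle, stop;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    start.record();
    add_one<<<blocks, threads>>>(devData, n);
    middle.record();
    square<<<blocks, threads>>>(devData, n);
    stop.record();

    if (stop.synchronize() != cudaSuccess) return false;

    out.first_ms  = middle - start;  // same as elapsed_time(start, middle)
    out.second_ms = stop - middle;   // same as elapsed_time(middle, stop)
    return true;
}
```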

cudaError_t ecuda::event::query ( )

inline

Query the status of all device work preceding the most recent call to record().

This applies to the appropriate compute streams, as specified by the arguments to record().

If this work has successfully been completed by the device, or if record() has not been called on this event, then cudaSuccess is returned. If this work has not yet been completed by the device, then cudaErrorNotReady is returned.

Returns: cudaSuccess, cudaErrorNotReady, cudaErrorInitializationError, cudaErrorInvalidValue, cudaErrorInvalidResourceHandle, cudaErrorLaunchFailure

Definition at line 153 of file event.hpp.
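
Because query() returns immediately, it can be used to overlap host-side work with device execution. The sketch below polls until the status is no longer cudaErrorNotReady; the counter stands in for real host work. The busy kernel, the poll_until_done helper, and the include path are assumptions made for this example.

```cuda
#include <cuda_runtime.h>
#include <ecuda/event.hpp>  // assumed include path (the synopsis shows <event.hpp>)

// Hypothetical kernel used only for illustration: repeatedly updates each element.
__global__ void busy(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < 1000; ++k) x = x * 0.5f + 1.0f;
        data[i] = x;
    }
}

// Launches busy() and polls the event instead of blocking on it.
// polls receives the number of times the event was found not ready.
// Returns the final status reported by query(). Requires n > 0.
cudaError_t poll_until_done(float* devData, int n, unsigned long& polls)
{
    ecuda::event done;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    busy<<<blocks, threads>>>(devData, n);
    done.record();

    polls = 0;
    cudaError_t status = done.query();
    while (status == cudaErrorNotReady) {
        ++polls;                 // stand-in for useful host-side work
        status = done.query();
    }
    return status;               // cudaSuccess, or an error other than cudaErrorNotReady
}
```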

void ecuda::event::record ( cudaStream_t stream = 0 )

inline

Records an event.

If stream is non-zero, the event is recorded after all preceding operations in that stream have been completed; otherwise, it is recorded after all preceding operations in the CUDA context have been completed. Since the operation is asynchronous, query() and/or synchronize() must be used to determine when the event has actually been recorded.

If record() has previously been called on this event, then this call will overwrite any existing state in the event. Any subsequent calls which examine the status of the event will only examine the completion of this most recent call to record().

Parameters

stream Stream in which to record the event.

Definition at line 123 of file event.hpp.
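
To illustrate the non-zero stream case, the sketch below launches a kernel in a stream created with cudaStreamCreate() and records the event in that same stream, so the event completes once the preceding work in that stream has finished. The negate kernel, the run_in_stream helper, and the include path are hypothetical names for this example.

```cuda
#include <cuda_runtime.h>
#include <ecuda/event.hpp>  // assumed include path (the synopsis shows <event.hpp>)

// Hypothetical kernel used only for illustration: flips the sign of every element.
__global__ void negate(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = -data[i];
}

// Runs negate() in its own stream and waits for it via an event.
// Returns the first error encountered, or cudaSuccess. Requires n > 0.
cudaError_t run_in_stream(float* devData, int n)
{
    cudaStream_t stream;
    cudaError_t status = cudaStreamCreate(&stream);
    if (status != cudaSuccess) return status;

    {
        ecuda::event done;
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;

        // Fourth launch parameter selects the stream; third is dynamic shared memory.
        negate<<<blocks, threads, 0, stream>>>(devData, n);
        done.record(stream);          // recorded after negate() in this stream
        status = done.synchronize();  // wait for the work preceding record()
    }   // the event is destroyed here, before the stream

    cudaStreamDestroy(stream);
    return status;
}
```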

cudaError_t ecuda::event::synchronize ( )

inline

Wait until the completion of all device work preceding the most recent call to record().

This applies to the appropriate compute streams, as specified by the arguments to record(). If record() has not been called on this event, cudaSuccess is returned immediately.

Waiting for an event that was created with the cudaEventBlockingSync flag will cause the calling CPU thread to block until the event has been completed by the device. If the cudaEventBlockingSync flag has not been set, then the CPU thread will busy-wait until the event has been completed by the device.

Returns: cudaSuccess, cudaErrorInitializationError, cudaErrorInvalidValue, cudaErrorInvalidResourceHandle, cudaErrorLaunchFailure

Definition at line 139 of file event.hpp.

The documentation for this class was generated from the following file:

include/ecuda/event.hpp
