Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
This needs to be updated to 3.0 (available) to support llvm 14.
|
|
|
|
|
|
Notable User Facing Changes
---------------------------
- support for LLVM 13
- CMake: Inter-Procedural Optimization is enabled on code of runtime library
(libpocl.so is compiled with -flto on systems that support it).
- LTTng tracing improved - more command types are traced, and also
some synchronous API calls (like clCreateBuffer) are traced.
- poclcc, tests and examples can be disabled with CMake options
- Valgrind support improved by making Valgrind aware of pocl's
reference counting of cl_* objects
- kernels which are called by kernels are now force-inlined
- Support for NetBSD.
- Support for Unix systems without libdl.
- PoCL can now (optionally) respond to SIGUSR2 by printing
some live debug information.
- improved SPIR support for CUDA devices
Notable Bug Fixes
-----------------
- Fixed a potential crash on Unix systems without sysfs mounted.
- Fixed compilation errors when building on macOS.
- Fixed POCL_FAST_INIT macro; POCL_INIT_LOCK must be invoked with only one argument.
- Fix bin/poclcc to not depend on OpenCL 2.0 symbols
- Fixed miscompilation in kernel loops with multiple conditionals with barriers in them.
Other
-----
- Add cmake options PARALLEL_COMPILE_JOBS, PARALLEL_LINK_JOBS to
use ninja's seperate compile and link job pools.
- Improve memory architecture, buffer migration and allocation.
Buffers are now allocated on a device when first used
(previously each buffer was allocated on every device in context).
- the single global LLVMContext was replaced with
multiple LLVMContexts, one per OpenCL cl_context.
OpenCL code can now be compiled in parallel
when using separate cl_contexts. This feature
is disabled by default since it significantly slowed
down PyOpenCL. This should be resolved by separating
LLVM compilation in their own threads in the future.
- a new OpenCL extension was added to PoCL: cl_pocl_content_size.
The extension allows the user to give optimization hint to PoCL,
which will be used internally by PoCL to optimize buffer transfers
between multiple devices.
|
|
All checksums have been double-checked against existing RMD160 and
SHA512 hashes
|
|
|
|
|
|
|
|
|
|
cpuinfo: check for fopen() failure when opening sysfs nodes to avoid segfault
https://github.com/pocl/pocl/pull/948
pocl_timing: fix detection of Unix monotonic clocks
https://github.com/pocl/pocl/pull/949
Fix detection of dlopen()/libdl
https://github.com/pocl/pocl/pull/950
|
|
|
|
|
|
OpenCL (Open Computing Language) is an open, royalty-free standard for
cross-platform, parallel programming of diverse accelerators found in
supercomputers, cloud servers, personal computers, mobile devices and embedded
platforms.
PoCL is a portable open source (MIT-licensed) implementation of the OpenCL
standard (1.2 with some 2.0 features supported). In addition to being an easily
portable multi-device (truely heterogeneous) open-source OpenCL implementation,
a major goal of this project is improving interoperability of diversity of
OpenCL-capable devices by integrating them to a single centrally orchestrated
platform.
|