Showing posts with label semaphores. Show all posts

Tuesday, 3 March 2015

Lrzip 0.620

I finally found some time to give lrzip some much-needed love again. Fortunately the last release, 0.616, proved very stable for the vast majority of workloads, so there was never any great need to give it attention. Little things slowly cropped up though, so I've accumulated all the bugfixes till now and released a new stable version, 0.620.

Freecode has long since died so here's the link to the (ghetto) home and download page:


http://lrzip.kolivas.org/

and the git source code page:

https://github.com/ckolivas/lrzip


Summary of changes:
- It would previously crash when decompressing from STDIN a file that was too large to fit in ram, which has now been fixed.
- It would previously fail to decompress files whose decompressed size was too large to fit in ram, now fixed.
- In some scenarios lrzip would run out of ram even though there was plenty to allocate, now fixed.
- Some other unix platforms would consider locking a mutex in one thread and releasing it in another a bug, so I've converted the use of those mutexes to anonymous semaphores.
- In order to maintain compatibility with platforms that don't properly support anonymous semaphores (OSX I'm looking at you), I've added the use of my custom fake semaphores as discussed here: unnamed-semaphores-and-pososx
- Some files would have their size reported wrongly with -i, now fixed.
- Added the ability to limit the use of ram with -m since lrzip happily uses all of it normally.
- Other minor changes and fixes for rare corner cases.
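
The mutex-to-semaphore conversion mentioned in the list above can be illustrated with a minimal sketch. POSIX leaves unlocking a default mutex from a thread that does not own it undefined (and makes it an error for error-checking mutexes), whereas a semaphore has no owner, so one thread can wait on it and a different thread can post it. This is not lrzip's code: `struct handoff`, `worker` and `handoff_demo` are made-up names, and the sketch uses plain `sem_init`, so it assumes a platform where unnamed semaphores actually work.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical example state: a semaphore used as a lock plus the data it guards. */
struct handoff {
    sem_t lock;
    int value;
};

/* The worker fills in the data, then releases the lock the main thread took. */
static void *worker(void *arg)
{
    struct handoff *h = arg;

    assert(h != NULL);
    h->value = 42;
    sem_post(&h->lock); /* "unlock" from a different thread than the locker */
    return NULL;
}

/* Returns the value written by the worker, or -1 on error. */
int handoff_demo(void)
{
    struct handoff h = { .value = 0 };
    pthread_t tid;

    /* Initial count 1 means unlocked, like a freshly initialised mutex. */
    if (sem_init(&h.lock, 0, 1))
        return -1;
    sem_wait(&h.lock);          /* main thread takes the lock */
    if (pthread_create(&tid, NULL, worker, &h)) {
        sem_destroy(&h.lock);
        return -1;
    }
    sem_wait(&h.lock);          /* blocks until the worker posts */
    pthread_join(tid, NULL);
    sem_destroy(&h.lock);
    return h.value;
}
```

Calling `handoff_demo()` from `main` returns the value the worker stored. The same pattern written with `pthread_mutex_lock` in one thread and `pthread_mutex_unlock` in the other is what stricter platforms reject.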


The changelog:
* Increase maxram when we abandon use of temporary input/output buffers
* Don't delete the tmpinfile when decompressing from stdin before allowing seek
to end to succeed in checking md5
* Use temporary file from read_seekto when STDIN will not fit in the ram input
buffer
* Remove unused read_i64 function
* Add message about issue tracker in BUGS
* Use a common exit path in lrzip_compress/decompress and fix lr leak on
successful return
* Fix parenthesis placement inside of unlikely().
* Clear sa_mask and sa_handler before calling sigaction().
* Fix for lrzip -i. Decompressed size wrong
* added '-m' command line option
* Fix wrong README file being included in Makefile
* Pass strict sizes to decompress length, rounding up only the amount we're
allocating to not confuse decompression libraries
* Convert the thread locking to use cksems
* Add cksems to util.h
* Fix 'Failed to malloc ckbuf in hash_search2' with very large files.
* Round up compression and decompression buffers to page size since malloc will
allocate them that large anyway.
* Increase the compressed buffer size given to libzpaq in case of incompressible
data since it does not check if it's trying to write beyond the end of the
buffer.
* Provide a helper function to round a value up to the nearest page size for
malloc optimisations.
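
The last changelog entry, rounding a size up to the nearest page size, can be sketched as follows. This is only an illustration of the arithmetic, not the helper that lrzip ships: the function names are hypothetical, and the bit-mask trick assumes the page size is a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: round len up to the next multiple of page_size.
 * Assumes page_size is a power of two, so a mask can replace a division. */
int64_t round_up_to_multiple(int64_t len, int64_t page_size)
{
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
    return (len + page_size - 1) & ~(page_size - 1);
}

/* The same rounding using the running system's page size from sysconf(). */
int64_t round_up_to_page_size(int64_t len)
{
    return round_up_to_multiple(len, (int64_t)sysconf(_SC_PAGESIZE));
}
```

With a 4096 byte page, a request for 4097 bytes becomes 8192, while a request for exactly 4096 bytes is left unchanged.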

Monday, 2 September 2013

Unnamed semaphores and POSOSX

During the development of my bitcoin mining software, cgminer, I've used just about every synchronisation primitive because it is heavily multithreaded. A few months back I needed some semaphores, and the first thing I reached for was the much more useful unnamed semaphores commonly in use today. Classic SYSV IPC semaphores are limited in number, require allocating shared memory, stay in use till destroyed or the system is rebooted, and so on, all of which makes them real awkward to use and far less flexible, so I never even considered using them. For some reason, though, I had a vague memory of trying to use unnamed semaphores on lrzip years ago and deciding not to. Well, that memory came back to bite me.

Cgminer is cross platform, working reasonably well on various architectures, mainly with Linux, Windows (via mingw32) and OSX, though other brave souls have used it on all sorts of things. I've often heard OSX described as the "Fisher-Price" unix, AKA "My first unix", because of the restricted subset of unix capabilities it has, although I'm led to believe it claimed POSIX compliance at some stage. I never really investigated that, nor does it really matter, since Linux is only POSIXy at best.

So the interesting thing was that I had written some code for cgminer which used unnamed semaphores and it compiled fine across the 3 main platforms, but it failed miserably when it came to actually running on OSX. Of note, the unnamed semaphore functions conform to POSIX.1-2001. All of the functions compiled perfectly fine, but the application refused to run properly. When I finally got some of the OSX users to investigate further, it turned out that every single unnamed semaphore function, such as sem_init, sem_post, sem_wait etc., returned a unique OSX error which, when deciphered, was actually "Unimplemented feature". Quite amusing that to get POSIX compliance it only had to implement the functions, but not the actual features of those functions... You may go wild with speculation as to why this may be. This is why I coined the term POSOSX.

After toying with the idea of using SYSV semaphores and being disgusted at the thought, I finally decided that I should just implement really basic fake unnamed semaphores using pipes on OSX to imitate their behaviour.

Simplified code from cgminer for OSX follows (real code checks return values etc.):

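#include <fcntl.h>
#include <unistd.h>

/* A fake semaphore: the count is the number of unread bytes in the pipe. */
struct cgsem {
    int pipefd[2];
};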

typedef struct cgsem cgsem_t;
 

void cgsem_init(cgsem_t *cgsem)
{
    int flags, fd, i;

    pipe(cgsem->pipefd);

    /* Make the pipes FD_CLOEXEC to allow them to close should we call
     * execv on restart. */
    for (i = 0; i < 2; i++) {
        fd = cgsem->pipefd[i];
        flags = fcntl(fd, F_GETFD, 0);
        flags |= FD_CLOEXEC;
        fcntl(fd, F_SETFD, flags);
    }
}

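/* Post: write one byte to the pipe, raising the count by one. */
void cgsem_post(cgsem_t *cgsem)
{
    const char buf = 1;

    write(cgsem->pipefd[1], &buf, 1);
}

/* Wait: read one byte, blocking until a post has made one available. */
void cgsem_wait(cgsem_t *cgsem)
{
    char buf;

    read(cgsem->pipefd[0], &buf, 1);
}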

void cgsem_destroy(cgsem_t *cgsem)
{
    close(cgsem->pipefd[1]);
    close(cgsem->pipefd[0]);
}
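
Since the simplified code above leaves out the return value checks, here is a minimal sketch of what they could look like, together with a tiny demonstration of the counting behaviour. This is not the real cgminer code: the `pipesem_*` names are hypothetical, and retrying on `EINTR` is one reasonable policy among several.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical names; same idea as cgsem, but returning 0 on success, -1 on error. */
struct pipesem {
    int pipefd[2];
};

int pipesem_init(struct pipesem *s)
{
    assert(s != NULL);
    return pipe(s->pipefd);
}

/* Post: write one byte, retrying if a signal interrupts the call. */
int pipesem_post(struct pipesem *s)
{
    const char buf = 1;
    ssize_t ret;

    do {
        ret = write(s->pipefd[1], &buf, 1);
    } while (ret == -1 && errno == EINTR);
    return ret == 1 ? 0 : -1;
}

/* Wait: block until one byte can be read, retrying on EINTR. */
int pipesem_wait(struct pipesem *s)
{
    char buf;
    ssize_t ret;

    do {
        ret = read(s->pipefd[0], &buf, 1);
    } while (ret == -1 && errno == EINTR);
    return ret == 1 ? 0 : -1;
}

void pipesem_destroy(struct pipesem *s)
{
    close(s->pipefd[1]);
    close(s->pipefd[0]);
}

/* Post three times, then wait three times: none of the waits block because
 * the pipe holds one byte per outstanding post. Returns 0 on success. */
int pipesem_demo(void)
{
    struct pipesem s;
    int i, ret = 0;

    if (pipesem_init(&s))
        return -1;
    for (i = 0; i < 3; i++)
        ret |= pipesem_post(&s);
    for (i = 0; i < 3; i++)
        ret |= pipesem_wait(&s);
    pipesem_destroy(&s);
    return ret;
}
```

One difference from a real semaphore is worth keeping in mind: a pipe has a finite capacity, so a post will block once that many posts are outstanding with nobody waiting.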