The Underhanded C Contest
Posted at 9:55 am by XcottCraver
We have now judged all submissions, and are happy to announce the runners up and winner of the 2015 Underhanded C Contest. This 12 months we had over 40 submissions, they usually have been all of top of the range. In consequence, our listing of runners up is fairly lengthy. I’ll present anchor hyperlinks under if you wish to skip forward
This 12 months’s problem (detailed below) is a real-world drawback in nuclear verification, sponsored by and designed in partnership with the Nuclear Threat Initiative (http://www.nti.org/), a nonprofit, nonpartisan group working to scale back the specter of nuclear, chemical and organic weapons. We hope that this emphasizes the necessity for care and rigor, to not point out new analysis, in safe software program growth for such functions.
Lastly, we’re going to run a stay Reddit AMA (“Ask Me Something,” for these of you who, like me, nonetheless use a tape recorder and a Commodore PET CBM) subsequent Tuesday, February ninth, at 1:00pm. We’ll put up extra specifics later, however when you’ve got questions on Underhanded C, the competition or the issue, this will probably be a fantastic alternative to ask.
An summary of NaN poisoning assaults
Most of the submissions (a few third of them!) used the identical trick, one that each programmer ought to pay attention to.
A floating-point variable will be set to NaN (“not a quantity”) on account of sure computations with undefined outcomes — for instance, computing sqrt(-1.0) or 0/0.
NaN values have the next properties:
- A computation involving a NaN enter will typically have a NaN outcome;
- A comparability with a NaN will consider to false.
That second truth is a syntactic limitation of many programming languages: just some datatypes can maintain an undefined worth. Mathematically talking, if x is undefined, then we must always anticipate
y = ((int) x) or y = (x >= 5)
to be undefined as properly; however integer and logical variables cannot be undefined, and people expressions have to judge to one thing.
Anyhow, this implies a NaN poisoning trick:
- Discover a method to introduce a NaN if some bizarre enter is given to this system;
- Permit the NaN to propagate by way of to the ultimate comparability statistic;
- Examine it to a threshold so {that a} false comparability (and therefore NaN worth) returns true, utilizing a line comparable to:
return (detection_score < threshold ? 0 : 1);
So how intelligent is that this? It relies on the way you introduce the NaN. Some entries simply assumed that the host nation may smuggle one into the enter, which we didn’t take into account a possible assault. Some entries computed take a look at statistics that required a division or a sq. root or a logarithm, in order that an uncommon enter would yield a NaN outcome. Right here is an efficient instance submitted by Peter Eastman:
int match(double* take a look at, double* reference, int bins, double threshold) {
double* testMinusBaseline = (double*) malloc(bins*sizeof(double));
double* referenceMinusBaseline = (double*) malloc(bins*sizeof(double));
for (int i = 0; i < bins; i++) {
testMinusBaseline[i] = fmax(0.0, take a look at[i]-baseline[i]);
referenceMinusBaseline[i] = fmax(0.0, reference[i]-baseline[i]);
}
[etc etc etc]
// Compute the Pearson correlation of take a look at with reference.
double correlation = 0.0;
for (int i = 0; i < bins; i++)
correlation += (testMinusBaseline[i]-meanTest)
*(referenceMinusBaseline[i]-meanReference);
correlation /= stddevTest*stddevReference;
return (correlation < threshold ? 0 : 1);
}
The code to subtract off the baseline[] array clamps all unfavourable values to zero, in order that if a take a look at warhead may in some way be engineered in order that take a look at[i]<=baseline[i], then the correlation comes out as 0/0 == NaN.
One drawback with NaN poisoning is that many coders are cautious about dividing by zero; to a C programmer, a floating-point division with no zero-check is a bit like watching a toddler enjoying with a steak knife. Such oversights are seemingly to attract consideration on assessment. Some submitters addressed this by introducing zero-checks that did not work, to wit this entry by Michael Dunphy:
/* dot: dot product between x and y */
static inline double dot(double *x, double *y, int n, unsigned int err)
/* match: compute cosine similarity and return 1 if >=threshold, else return 0 */
int match(double *take a look at, double *reference, int bins, double threshold) {
unsigned int err = 0;
double magr = sqrt(dot(reference, reference, bins, err));
double magt = sqrt(dot(take a look at, take a look at, bins, err));
double r = dot(take a look at, reference, bins, err) / (magt*magr); // cosine similarity
if ( err > 0 ) return 0; // take a look at fails for any invalid information
return r >= threshold ? 1 : 0; // examine r towards threshold for match outcome
}
This units an error flag if something would set off a NaN, however the err flag is meant to be handed by reference and is lacking an ‘&’ within the declaration.
A couple of different attention-grabbing NaN methods included a enjoyable entry by Sean Egan, which mixed an easy match() operate with the next most important() code to parse take a look at[] and reference[] values from a textual content file enter:
double *take a look at = malloc(bins * sizeof(double));
double *ref = malloc(bins * sizeof(double))
memset(take a look at, -1, bins * sizeof(double));
memset(ref, -1, bins * sizeof(double));
/* Parse the remaining arguments into the arrays. */
for (i = 0; i < bins; i++) {
sscanf(argv[i + 2], "%lf", &ref[i]);
sscanf(argv[bins + i + 2], "%lf", &take a look at[i]);
}
int res = match(take a look at, ref, bins, threshold);
The arrays are memset() to 0xFF, after which loaded with numbers from the file. If the file comprises an entry that isn’t parseable as a quantity, which might occur by a variety of methods comparable to humorous Unicode characters that resemble digits, sscanf() will go away that array ingredient unassigned, and a double variable made out of 0xFF values is a NaN.
Typically, NaN poisoning assaults didn’t make our brief listing, both as a result of (1) they assumed a number nation may tinker with the enter, (2) they have been too brazen in performing a floating level operation with no examine, or (3) they organized for a NaN to happen by circumstances that have been too contrived. There have been a pair, although, that acquired our consideration, that we listing under.
A Notice On Realism
A successful entry wants to permit a false constructive that may be achieved below practical circumstances, and which hardly ever ever occurs by chance. We famous that the submissions fell into a number of classes on this division:
- Some entries simply made simplistic or unrealistic assumptions about what the host nation may do, for instance corrupting an enter array.
- Some entries engineered a bug that will happen when a sure type of take a look at spectrum is launched, comparable to one with out spikes or one with an excessive worth. These we name data-triggered assaults.
- Some entries engineered a bug triggered by some environmental issue within the laptop — comparable to setting the uid on a file or tampering with the system clock.
These we name environment-triggered assaults.
Surroundings-triggered assaults require {that a} host nation may cause another impact within the laptop, that will appear to bear little relevance to this system.
Relying on the set off, this may very well be achieved by tinkering with the OS, or breaking a bodily connection someplace.
Two submissions from Sarah Newman and S. Gilles, for instance, parallelized their match() operate and triggered underhanded conduct if somebody modifications the variety of accessible CPUs throughout the computation.
Are environment-triggered assaults practical?
Would a number nation in a nuclear inspection situation be allowed to tamper with
a pc, for instance to futz with the system clock or variety of CPUs?
To begin with, environmental tampering will be so delicate that it’s not actually
a matter of whether or not somebody will
the suggestions we acquired from inspection consultants recommended that such tampering
cannot be utterly dominated out merely as a result of bodily custody of the
machines. Lastly,
one cannot actually choose environmental assaults as practical or unrealistic
typically, as a result of they cowl a really broad
spectrum of environmental components that may be modified in many various methods.
Then again, some submissions featured data-triggered assaults that brought on by circumstances which aren’t practical, comparable to presenting a spectrum so energetic that one of many bins exceeds MAX_INT. We didn’t dictate any particular guidelines concerning the values of the enter array, past encouraging the reader to Google gamma ray spectra; nonetheless, we did give much more factors to entries that may very well be triggered by delicate means, which are likely to correspond to bodily realizeable circumstances.
All advised, we do have a tendency to love data-triggered assaults as a result of they do not rely on any further assumptions, and provides us a self-contained assault.
Runners Up
Philipp Klenze
This can be a NaN poisoning bug, utilizing a chance ratio take a look at:
lengthy double factorial(int n)
{
return (n<2)?1:n*factorial(n-1);
}
// returns poisson chance mass operate
double poisson(int okay, double lambda)
{
return powl(lambda, okay)*expl(-lambda)/factorial(okay);
}
//returns the chance ratio
double logLikelihoodRatio(int k1, int k2)
{
double imply=(k1+k2)/2.0;
return -2*(log(poisson(k1, imply))
+ log(poisson(k2, imply))
- log(poisson(k1, k1))
- log(poisson(k2, k2)));
}
double logLikelihoodRatioSum(double * take a look at, double * reference, int bins)
{
double llq=0;
int i;
for (i=0; i<bins; i++)
{
llq+=logLikelihoodRatio(take a look at[i], reference[i]);
}
return llq;
}
The match() operate compares logLikelihoodRatioSum to a threshold.
That is attention-grabbing as a result of it does not merely strike a NaN by computing 0/0; the NaN comes from computing a bin worth’s chance below a Poisson distribution, which happens for bin values approaching 2000. This isn’t solely a sensible method to evaluate two arrays of occasion counts, but it surely permits a bug to be realistically triggered. To cite Mr. Klenze:
To utilize this bug, the dishonest nation must guarantee {that a} bin has a rely of no less than 1686. I’d advocate an anticipated peak worth of 1860, which is able to nearly all the time assure this (assuming a Poisson distribution, once more). This may be archived by packing the faux filled with a short-lived nuclide with a single robust peak, or alternatively establishing an x-ray machine contained in the faux.
This can be a very clever method to obtain an excessive worth bug. Bugs of this sort often require absurdly excessive enter values to set off overflow or underflow in arithmetic, which doesn’t admit a bodily realizable assault; however by computing a chance for a chance ratio, it is simple to hit these absurd values with widespread pdfs, practical parameters, and a smallish enter worth.
Ghislain Lemaur
This entry hides a NaN bug in logging code.
double SSDWR(double * e, double * o){
double sum = 0;
for(int i=0; i<glob.bins; ++i){
sum += pow( o[i]-e[i], glob.diff_exp) / e[i];
}
return sum;
}
int statistic_test(double chi2){
return chi2 > glob.threshold;
}
int match(double * take a look at, double * reference, int bins, double threshold){
glob.bins = bins;
glob.threshold = threshold;
glob.diff_exp = 2.0;
LogInit();
LogStr("Check params: ");
LogStr("exp="); LogFloat(glob.diff_exp);
LogStr("threshold="); LogFloat(glob.threshold);
int ret = !statistic_test(SSDWR(reference, take a look at));
LogStr("outcome="); LogStr(ret ? "sure" : "no"); LogStr("n");
LogFlush();
return ret;
}
Plenty of logging. What is going on on right here? This is the LogStr operate (LOG_SIZE is the size of the glob.long_str array):
void LogStr(char* s){
if(LOG_SIZE < snprintf(NULL, 0, "%s %s", glob.log_str, s)){
if(ENABLE_LOG) fprintf(glob.log_file ? glob.log_file :
stdout, "%s ", glob.log_str);
strncpy(glob.log_str, s, LOG_SIZE);
}
else
sprintf(glob.log_str, "%s %s", glob.log_str, s);
}
So this tries to append a message onto the log if it matches. There’s an additional area between the highest snprintf and the underside snprintf, and LOG_SIZE does not account for the additional byte wanted for a null terminator, so a log message of simply the proper size will write two bytes previous the log string — into the variable glob.diff_exp, used to retailer the exponent 2.0.
Due to little-endian illustration, overwriting the primary two bytes of the exponent modifications it to a worth that’s only a tiny bit greater than 2.0 — sufficiently subtle that it’ll nonetheless show as 2.0 when the logging code prints it out! However since it isn’t an integer, pow( o[i]-e[i], glob.diff_exp) returns NaN at any time when the primary argument is unfavourable.
Very properly, however what occasion triggers this log overflow that triggers the NaN?
void LogInit(){
char filename[]="/tmp/log.txt";
glob.log_str[0] = 0;
glob.log_file = ENABLE_LOG==1 ? fopen(filename,"a") : NULL;
if(glob.log_file==NULL){
struct stat s;
if(0==stat(filename, &s))
sprintf(glob.log_str, "Unable to open %s: owned by %dn",
filename, s.st_uid);
else
sprintf(glob.log_str, "Unable to open %sn", filename);
}
}
A bunch nation units the uid of /tmp/log.txt to a five-digit quantity. Tinkering with the uid of a log file causes this system to misbehave.
Josh Lospinoso
This can be a simple typo:
int match(double *take a look at, double *reference, int bins, double threshold) {
int bin=0;
double testLength=0, referenceLength=0, innerProduct=0, similarity;
for (bin = 0; bin < bins; bin++) {
innerProduct += take a look at[bin]*reference[bin];
testLength += take a look at[bin]*take a look at[bin];
referenceLength += reference[bin]*reference[bin];
}
if (isinf(innerProduct)||isinf(testLength)||isinf(referenceLength)) {
return isinf(testLength)&&sinf(referenceLength) ? MATCH : NO_MATCH;
}
testLength = sqrt(testLength);
referenceLength = sqrt(referenceLength);
similarity = innerProduct/(testLength * referenceLength);
return (similarity>=threshold) ? MATCH : NO_MATCH;
}
That is cosine similarity (normalized correlation) that seems to examine for infinite values. It is going to return NO_MATCH if one of many values is infinite, except each take a look at and reference are infinite (why?) The bug is a lacking ‘i’ from an isinf() name, that as a substitute returns the sine of referenceLength–a worth that’s unlikely to be 0, and thus prone to be true.
To “set off” this, you want a take a look at array with an excessive worth that will make the take a look at vector of infinite size, i.e. current information that causes an overflow. This isn’t prone to be realistically achievable as a data-triggered assault.
Stephen Dolan
This one computes the Euclidean distance between the take a look at[] and reference[] arrays, with an additional quirk of eradicating the most important outlier distance from the sum. I do not quote it right here, however this additionally consists of validation code to flag infinite values.
int match(double * take a look at, double * reference, int bins, double threshold)
{
double sum_dist_sq = 0, outlier = 0;
int i;
/* Validate the take a look at information earlier than doing any computations.
If it fails, return a failed match. */
if (!validate(take a look at, bins)) return 0;
/* We are able to in all probability belief the reference information, however could as properly examine */
if (!validate(reference, bins)) return 0;
/* Discover the L2 norm and the largest outlier in a single move.
We maintain observe of the sum of squared distances of each element,
and which squared distance is largest. After the loop, we take away
the largest outlier. */
for (i = 0; i < bins; i++) {
double dist_sq = (take a look at[i] - reference[i]) * (take a look at[i] - reference[i]);
sum_dist_sq += dist_sq;
if (dist_sq > outlier) outlier = dist_sq;
}
/* Take away the most important outlier */
sum_dist_sq -= outlier;
return sqrt(sum_dist_sq) < threshold;
}
The bug is because of a property of floating-point arithmetic: if a floating-point worth X is massive sufficient, X+1 is similar as X, as a result of restricted precision of the illustration. If the primary worth of (take a look at[i]-reference[i]) is far, a lot bigger than the others, like 10E8 instances as massive, then all the following values may have no impact when their squares are added to the “sum”. This causes the sum to be equal to that first excessive worth, which can be the outlier that’s subtracted from the sum–yielding zero.
That is triggered by presenting a take a look at[] array whose first worth is unrealistically massive relative to the others.
Matt Bierner
This one was very advanced and in the end suspicious, however enjoyable.
/**
"觬巂 - "
Translation: "Error - "
Localized error label for the Wonderful Democratic Peoples Republic Of Bob.
*/
#outline ERROR_LABEL "x89xecx5dxc2 - "
static void* error_message = ERROR_LABEL "A math error occured. Returning that no match was discovered.";
void error_messager() {
printf("%s", error_message);
}
So, some contrived error-handling code with a Unicode alphabet from the Republic of Bob.
int do_elements_match(double sample1, double sample2, double threshold, error_handler on_error) {
double diff = fabs(sample1 - sample2);
// Examine to see if a math error occured.
if (fetestexcept(FE_INVALID)) {
on_error();
// Math errors all the time set off false matches.
return 0;
}
return diff <= threshold;
}
int match(double* take a look at, double* reference, int bins, double threshold) {
for (unsigned i = 0; i < bins; ++i)
if (!do_elements_match(take a look at[i], reference[i], threshold, error_message))
return 0;
return 1;
}
This match() operate calls a separate operate, with an error handler to alert the person and correctly return 0 if there may be any floating level exception raised.
The bug is that the error handler is known as error_messager, and the code as a substitute passes error_message, which is a string. The code mistakenly calls a string as a operate, which might usually trigger crashy issues to happen–except the preliminary bytes of the Unicode string are legitimate x86 directions, that pop a stack body and causes do_elements_match to return true.
To set off this, one must initially carry out some operation that computes an invalid floating level worth, for instance discovering an excuse to compute a sqrt(-1) earlier than the match() operate is known as. Why and the way is left unspecified, however this can be a type of environmental set off. It isn’t clear how this is able to be triggered externally solely below the proper circumstances engineered by the host nation.
Gregory Stewart
This entry performed some methods with information constructions and gotos, to create a program that misbehaves if this system performs match() a number of instances:
int match(double *take a look at, double *reference, int bins, double threshold)
{
struct pattern *s, *prev, *head;
int i, outcome;
s = prev = head = NULL;
for (i = 0; i < bins; i++) {
s = sample_create(take a look at[i], reference[i], threshold);
if (prev)
prev->subsequent = s;
else
head = s;
prev = s;
}
outcome = sample_match(head);
sample_destroy(&s);
return outcome;
}
int sample_match(struct pattern *s)
{
int outcome;
outcome = 1;
if (!s)
goto error;
whereas (s) {
if (fabs(s->take a look at - s->reference) > s->threshold)
goto error;
s = s->subsequent;
}
goto finished;
error:
outcome = 0;
finished:
return outcome;
}
struct pattern *
sample_create(double take a look at, double reference, double threshold)
{
struct pattern *s;
if (pool_acquire(&s)) {
s->take a look at = take a look at;
s->reference = reference;
s->threshold = threshold;
s->subsequent = NULL;
}
return s;
}
The enter arrays are loaded right into a linked information construction of pattern nodes, which is then handed over with a purpose to match them. Nodes are acquired from an allotted pool, and returned to the pool after the match (sample_destroy).
The underhanded half is a bug that causes a reminiscence leak –the name to sample_destroy(), above, solely “destroys” the final node within the list–and pool_acquire misbehaves as quickly because the pool is empty and it has to malloc() a brand new node:
int pool_acquire(struct pattern **s)
{
struct pattern *p;
int outcome;
outcome = 1;
if (!s) goto error;
p = pool_head;
if (!p) {
p = malloc(sizeof(struct pattern));
if (!p)
goto error;
memset(p, 0, sizeof(struct pattern));
*s = p;
} else {
if (p->subsequent) {
pool_head = p->subsequent;
p->subsequent = NULL;
} else {
pool_head = NULL;
}
*s = p;
goto finished;
}
error:
outcome = 0;
finished:
return outcome;
}
This bug is perceptual in nature: a number of features on this code observe a goto sample of
goto finished;
error: outcome = 0;
finished: return outcome;
…however on this operate the { goto finished; } is inside a curly brace. A newly allotted pool entry will fall by way of to error: and return a 0 to sample_create(), which then fails to initialize the brand new node with the array information. In these nodes, the entries take a look at, reference and threshold are all 0, they usually all the time match as samples.
As a result of one node is all the time returned to to the allocation pool on the finish of match(), the allocation pool has no less than one entry firstly of match(). So:
- Till the pool runs out, the conduct of match() is regular.
- As soon as the pool runs out, the output of match() relies upon fully on the primary ingredient of take a look at[] and reference[]. If they’re shut, the output is a match.
This bug is subsequently triggered by performing a number of exams; thereafter, it may be considerably managed by the values measured firstly of the spectrum.
All advised, nonetheless, it’s considerably suspicious to repeat an array over right into a linked information construction, particularly if the underlying comparability algorithm can be a quite simple loop.
Stephen Oman
This can be a simple however very intelligent negative-number bug. The code computes the realm of peaks within the spectra and compares them, however a bug causes the take a look at[] array values to be negated. No one expects the spectrum values to be unfavourable, or expects that the realm below a peak would subsequently be unfavourable. This causes the next code to all the time move:
lb_match = (lb_rpeak/tpeak) <= (1.0 + DBL_EPSILON) ? 1 : 0;
ub_match = (tpeak/ub_rpeak) <= (1.0 + DBL_EPSILON) ? 1 : 0;
Right here, tpeak is the realm below a peak. Since it’s a unfavourable quantity, the ratio on the left is all the time under the brink.
How does the info find yourself mistakenly negated? Like so:
if(!feof(file_ptr)) {
int s_seconds, e_seconds, t_seconds = 0; // begin, finish and whole
fscanf(file_ptr, "%d",&s_seconds);
fscanf(file_ptr, "%d", &e_seconds);
t_seconds = e_seconds - s_seconds;
if(t_seconds != 0) {
double *present = information;
for(int i=0; i<bins; i++) {
if(feof(file_ptr)) {
printf("get_data: not sufficient bins in file %sn", filename);
printf("anticipated %d, learn %d earlier than EOFn", bins, i);
return 5;
}
fscanf(file_ptr, "%lf", present);
*present = (*present)/(double)t_seconds;
present++;
}
} else {
printf("get_data: unable to normalise information, time is 0n");
return 6;
}
}
A most important operate expects a spectrum with two further numbers displaying the beginning and ending time of a scan. It then normalizes the spectrum from counts to counts/sec, by computing the time differential and dividing by it (within the bolded strains of the code above).
Right here, if a number nation can in some way mess with the clock that gives that beginning and ending time, it may well journey the clock throughout the scan in order that the ending time comes earlier than the beginning time. This produces a unfavourable time differential that’s multiplied (divided, actually) into each array ingredient.
That is an environmental assault: somewhat than triggering the bug by putting the proper of surprising take a look at object in entrance of the scanner, the host nation triggers the bug by way of some side of the computing setting operating this system, on this case the system clock.
And now the winner:
The Winner
Linus Åkesson
Mr. Åkesson truly supplied an in depth write-up of his entry, in two components here and
here.
On paper, this entry first exams the overall power within the take a look at spectrum to verify it is not empty or conspicuously small. Then it filters and smooths the spectra with a purpose to spotlight their peaks, and compares them by normalized correlation.
The entry begins with the next strains in a header file:
#outline N_SMOOTH 16 /* Dimension of smoothing kernel */
typedef double float_t; /* Desired precision for floating-point vectors */
int match(float_t *take a look at, float_t *reference, int bins, double threshold);
double spectral_contrast(float_t *a, float_t *b, int size);
This header file defines float_t as double precision, though by default math.h defines float_t as single precision. One C file #consists of this header file, and one C file (the place spectral_contrast is outlined) #consists of math.h. Neither file #consists of each, so there is no such thing as a overridden typedef of float_t.
In consequence, match takes an array of 8-byte double-precision floating level numbers, and palms them to a operate that expects an array of 4-byte single-precision floating level numbers.
Past that kind confusion, the code does nothing uncommon or suspicious or quirky or completely different: it’s a bog-standard implementation of preprocessing adopted by normalized correlation. There’s actually nothing underhanded in the remainder of the code! Right here is the comparability code (which is known as after this system verifies that the spectrum has enough power, so there is no such thing as a division by zero error):
#embody <math.h> /* sqrt */
static double dot_product(float_t *a, float_t *b, int size) {
double sum = 0;
int i;
for(i = 0; i < size; i++) sum += a[i] * b[i];
return sum;
}
static void normalize(float_t *v, int size) {
double magnitude = sqrt(dot_product(v, v, size));
int i;
for(i = 0; i < size; i++) v[i] /= magnitude;
}
double spectral_contrast(float_t *a, float_t *b, int size) {
normalize(a, size);
normalize(b, size);
return dot_product(a, b, size);
}
So what goes improper? The operate anticipating 4-byte numbers will basically see each 8-byte double as two variables. This has two results:
- It causes the operate to solely scan over the primary half of the array, deciphering the primary 4*size bytes as
size
numbers; - The bits of every 8-byte double finally ends up interpreted as two numbers in a bizarre approach.
How? 4- and 8-byte numbers have completely different exponent lengths, so if we mistakenly view an 8-byte quantity as two 4-byte numbers, some exponent bits will probably be mistaken for mantissa bits, and the second quantity will probably be made fully out of mantissa bits from the 8-byte quantity.
Mr. Åkesson noticed that the spectra we’re testing encompass whole-number counts, nonetheless, and the mantissa of an IEEE floating-point quantity is left-justified; so so long as these counts are lower than 1,000,000 (lower than 20 bits,) all of the “content material” of the 8-byte quantity is on the left facet, and the remainder will probably be zeroes. Thus the second half of the double will probably be mistaken for a 4-byte worth 0.0.
In the meantime, the left facet will probably be confused for a 4-byte worth with the identical signal, a smaller exponent, and with some zero bits of the exponent handed alongside to the mantissa. This can be a nonlinear distortion that turns an integer into an actual quantity with a extremely squashed vary (as proven within the determine under, taken from Mr. Åkesson’s abstract.)
We’ll get to the squashing in a minute, however simply to hammer this residence: as a result of confusion concerning the float_t kind, an enter array of double values that appears like
< … a, b, c, d, e, f, g, h … >
…will probably be misinterpreted as
< … SQUASH(a), 0, SQUASH(b), 0, SQUASH(c), 0, SQUASH(d), 0 … >
…the place SQUASH is the operate plotted above. Each take a look at[] and reference[] are misinterpreted on this approach; the zeroes haven’t any impact on the correlation between the 2 vectors, and we are able to ignore them.
Now for the SQUASHing: you’ll observe that SQUASH(10) and SQUASH(1000) aren’t very far aside. Think about you take away most, however not all, of the fissile materials from a take a look at object. The impact on the array as seen by the spectral_contrast() operate will not be very a lot, owing to this flattening impact.
Utilizing this, a number nation’s recipe for beating the system is abandoning a small quantity of fissile materials, so there’s a tiny little bit of power, after which putting one thing else within the take a look at object that produces a considerable spectral signature in the proper half of the spectrum. This power in the proper half permits the spectrum to move the preliminary take a look at for whole power, after which the proper half of the spectrum is ignored. Then, the SQUASH impact ends in a take a look at[] and reference[] array that are not that completely different from each other, though there are very completely different quantities of fissile materials and really completely different magnitudes within the unique spectrum.
Why did we like this one?
- The assault is realistically achievable with out triggering some impact within the laptop, like tampering with a system clock or a file permission;
- It makes use of a real-world method to evaluating spectra, somewhat than one thing easy or contrived;
- In case you miss the kind confusion, there may be nothing in any respect concerning the remaining code that appears the slightest bit
suspicious or uncommon; - It will get right down to the bitwise illustration of floating-point numbers, and inflicting a confusion of datatypes with usable outcomes is ingenious;
- It exploits the truth that the doubles maintain complete quantity counts, which permits the miscasting of doubles to work;
- The assault is definitely clear to all of the filtering and preprocessing, which preserves the whole-number property of the info;
- Regardless of all this, it is nonetheless solely 60-odd strains of code, that appears extremely harmless.
Congratulations Linus Åkesson, you’re the most underhanded C programmer of 2015.
Posted at 11:59 pm by XcottCraver
Introduction
We hereby announce the eighth annual Underhanded C Contest – a contest that challenges coders to unravel a easy information processing drawback by writing innocent-looking C code that’s as readable, clear, and seemingly reliable as potential, but covertly implements a malicious operate. In some ways that is the precise reverse of the Obfuscated C Code Contest. The principle aim of this contest is to jot down supply code that’s simple and simply passes visible inspection by different programmers, however implements some particular underhanded conduct that can’t be simply detected.
Yearly, we suggest a brand new underhanded problem. Examples embody miscounting votes, shaving cash from monetary transactions, and leaking data to an eavesdropper — all of which display the necessity for cautious and formal code assessment in actual functions.
This 12 months, the Underhanded C Contest is happy to cooperate with the Nuclear Threat Initiative (http://www.nti.org/), a nonprofit, nonpartisan group working to scale back the specter of nuclear, chemical and organic weapons. This 12 months’s problem relies on an actual problem with nuclear arms management monitoring and verification applied sciences; we hope this can elevate consciousness of the complexity of such issues, and the crucial want for greatest coding practices on this space.
NTI is contributing to this 12 months’s contest in two thrilling methods: by providing a nicer prize than I can present on a professor’s wage, and by planning a subsequent joint programming contest after the common Underhanded contest finishes. This additionally implies that the 2015 contest judging will now proceed alongside a strict calendar, which will probably be much less irritating for everybody. Keep tuned for particulars on the follow-on contest – within the meantime, the 2015 problem is now open!
The 2015 Problem: Faking Fissile Materials
Two nations, the Peoples Wonderful Democratic Republic of Alice and the Wonderful Democratic Peoples Republic of Bob, have agreed to a nuclear disarmament treaty. In apply, that is carried out by nuclear inspectors visiting every nation and verifying the presence of fissile materials comparable to Plutonium in a warhead, at which level the warhead will be destroyed.
Ideally, the inspectors would topic a warhead to a scan and observe a graph comparable to a radiogram or a gamma ray spectrum of the article below take a look at, in order that they’ll verify the warhead comprises what it’s alleged to comprise. However each the PGDRA and the GDPRB are dissatisfied with this method as a result of the outcomes of those scans comprise delicate details about their nuclear packages and the design of their nuclear weapons. Nonetheless, every nation needs to make sure that the opposite nation is dismantling actual nuclear warheads, and never fakes.
To this finish, the 2 nations conform to construct a fissile materials detector with an “data barrier”—-essentially a pc program that may take the results of a scan, decide if it matches some reference sample, and output solely a “sure” or “no.”
Your aim is to jot down the operate that compares a take a look at measurement to a reference sample:
int match( double * take a look at, double * reference, int bins, double threshold )
…the place take a look at
and reference
are 1-dimensional arrays of width bins
, that encode a gamma ray spectrum of a take a look at warhead and a reference warhead, respectively. Your detector ought to output a 1 (“sure”) if these two arrays are sufficiently related, and 0 (“no”) in any other case; you need to use the parameter threshold
to regulate the sensitivity of the match.
You might be free to make use of any system to determine if take a look at[]
and reference[]
are a detailed match, so long as it’s justifiably correct: i.e., that two very related alerts match for an inexpensive threshold, and two clearly distinguishable alerts fail to match for some threshold. You’ll get extra factors for utilizing a standard comparability system or detector construction; one thing bizarre and ad-hoc could also be suspicious and onerous to justify.
Right here’s what you’ll be able to assume concerning the information:
- Sure components, together with plutonium, will produce a spectrum with peaks (considerably massive values) in among the array components;
- The fissile materials in a warhead has a sample of peaks that’s basically unimaginable to counterfeit with different components—-with bogus materials sure peaks is not going to be as excessive, and spurious peaks will probably be current.
- A whole absence of attention-grabbing components produces a reasonably boring baseline spectrum that isn’t secret; you’ll be able to assume the existence of a worldwide array
double baseline[]
that holds the anticipated values of this spectrum, which you’ll use in your detector. - In all observations, the measured spectra comprise some random noise and won’t match precisely.
- On account of registration or calibration points, a warhead could should be examined a number of instances towards a number of reference patterns.
- The reference sample or patterns are both saved securely, or decided on the spot by aiming the detector {hardware} at a reference warhead.
- Reference patterns are utterly reliable and cannot be doctored upfront.
The Underhanded Half
In some way, match()
ought to return a “sure” even when fissile materials is lacking, if triggered by sure circumstances that may be engineered by the host nation. To be extra particular, it ought to in some way return 1 for a take a look at[]
array that’s so in contrast to reference[]
that the detector ought to output 0. Furthermore, this could’t be a nasty detector total: it should accurately and persistently output 0 for arbitrary take a look at[]
vectors that don’t match reference[]
. Solely below uncommon particular circumstances ought to this false match come up.
The circumstances that set off the underhanded conduct are as much as you. They are often brought on by some attribute of the take a look at[]
information, some quirk of this system’s setting, and many others. An entry is price extra factors if the underhanded circumstances are simply engineered, hardly ever occur by chance, and don’t lead to suspicious code.
Scoring and Additional Factors
- Typically, submissions are price extra factors if they’re shorter and extra readable, as a result of it’s extra spectacular to cover a bug briefly, readable code.
- Errors primarily based on human notion, like mistaking an l for a 1, are price simply as a lot as “onerous” errors primarily based on pointer abuse or little-endian weirdness or quirks of C operate calls. The aim is a intelligent vulnerability that passes visible inspection, regardless of the mechanics of the underlying bug.
- Bugs are price extra factors if, as soon as found, they’re plausibly deniable as an harmless programming error.
- Errors are price extra factors if they continue to be innocent-looking below syntax coloring.
- Errors could also be price fewer factors if they’re processor or OS dependent, however provided that we have now to scavenge a system to check your bug. In case your bug works particularly below GNU/Linux, don’t fear about it. If it really works particularly below x86, no drawback. If it solely works on BeOS R5 on a twin G3 field when all 4 MIDI ports are energetic, then no.
- As all the time, further factors are awarded for humorous, spiteful, or ironic bugs, comparable to error-prone conduct in an error-checking routine.
Submission Pointers and Deadlines
Ship your submissions to underhandedC@gmail.com by November 15. Be sure you embody a proof of your code, however place this in a separate spoiler file in order that we are able to initially consider your entry in full ignorance of your method.
Aug. 15, 2015 | Contest opens |
---|---|
Nov. 15, 2015 | Submission deadline |
Jan. 15, 2016 | Outcomes of Judging |
Jan. 15, 2016 | Second section of contest introduced |
Prize
The winner of this 12 months’s contest will obtain $1,000. Winners and runners-up alike will probably be featured on the Underhanded C Contest net web page.
If there are any questions, please direct them to underhandedC@gmail.com