#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Job offer submission form.

WARNING: attachment has multiple copies in memory thus leading to a non
    negligible memory footprint.

FOR DEVELOPERS:

# General workflow (see process_form function)
1) The form data are put into a FormData instance
2) Form sanity is checked and a FormError is filled with potential errors
3) If there is an error, the form is displayed with additional error message
4) Otherwise, a job offer is filled using the JobOffer class
5) A new pull-request is created with the create_job_request function
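
A minimal sketch of the workflow above, using simplified stand-ins: the names
MiniFormData, MiniFormError and mini_process_form are hypothetical and only
mimic the roles of FormData, FormError and process_form (steps 4 and 5 are
reduced to a return value).

```python
class MiniFormError:
    # Holds one error message per field (empty string means no error).
    def __init__(self):
        self.title = ''
        self.email = ''

    def has_error(self):
        return bool(self.title or self.email)


class MiniFormData:
    # Step 1: raw form values are stored (and stripped) here.
    def __init__(self, title='', email=''):
        self.title = title.strip()
        self.email = email.strip()

    def check(self):
        # Step 2: sanity checks fill a MiniFormError instance.
        errors = MiniFormError()
        if not self.title:
            errors.title = 'Missing title'
        if '@' not in self.email:
            errors.email = 'Invalid email'
        return errors


def mini_process_form(raw_fields):
    form_data = MiniFormData(**raw_fields)
    errors = form_data.check()
    if errors.has_error():
        # Step 3: the real script displays the form again with the messages.
        return 'form', errors
    # Steps 4 and 5: the real script builds a JobOffer and opens a
    # merge request; this sketch only reports success.
    return 'submitted', errors
```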

# How to add a form field
1) if it can raise an error (i.e. an invalid form), first add a property to
    FormError class (contains the error message) and add a condition in the
    has_error method.
2) add a corresponding property to FormData class (with default value) and
    add (if necessary) a sanity check in the check method. Also check if
    the field is mandatory.
3) update the web form template and/or the update_html_form function.
    Don't forget to escape Jinja script so that the pelican pass doesn't
    remove it. Add `required` attribute and a `*` if necessary.
4) describe how to fill the FormData from the input request in the process_*
    functions.
5) add a corresponding property in JobOffer class
6) include this property in the id calculation (_calc_id method of JobOffer).
7) add a corresponding line in the job offer file template
    (content/job_offers/job_offer.md.template) and add corresponding line when
    launching the rendering (_render method of JobOffer).
8) update the Pelican job offer template (job_offer.html).
9) add corresponding line in the interface processors
   (process_cmdline and process_cgi so far).
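
To illustrate step 6, here is a sketch of an id computed over an explicit list
of field names, in the spirit of the _calc_id method. The calc_id helper and
the salary field are hypothetical; the point is that a field left out of the
list has no influence on the id.

```python
import hashlib


def calc_id(fields, names):
    # Join the string form of each listed field with '|' and hash the result.
    m = hashlib.md5()
    m.update(b'|'.join(str(fields[name]).encode() for name in names))
    return m.hexdigest()


offer_a = {'title': 'Engineer', 'salary': '30k'}
offer_b = {'title': 'Engineer', 'salary': '40k'}

# Without the new field in the list, both offers share the same id.
same_id = calc_id(offer_a, ('title',)) == calc_id(offer_b, ('title',))

# Once the field is listed, the ids differ.
distinct_id = calc_id(offer_a, ('title', 'salary')) != calc_id(offer_b, ('title', 'salary'))
```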

# How has the SimpleMDE editor been customized?
1) Firstly, please consider using another editor since it seems to be
    abandoned.
2) OK, as you want. Install SimpleMDE using the Node.js package manager:
    npm install simplemde --save
    (you may need to install dependencies, I don't remember...)
3) Fix some bugs that prevent "compilation" of SimpleMDE:
    * in src/css/simplemde.css, line 121-122 (in the .editor-toolbar a block),
        add a space before `!important` for `text-decoration` and `color`
        properties.
    * in src/js/simplemde.js, remove the backslash before `-` in each regular
        expression, i.e. replace `\-` by `-`. Should be the case for lines 75,
        174, 904, 905 and 1012.
4) Disable HTML in Markdown syntax: in src/js/simplemde.js, add after
    `var marked = require("marked");` line 15, the following line:
    marked.setOptions({sanitize: true});
5) "Compile" the package by running `gulp` (install it using npm) from the root
    of SimpleMDE.
6) Enjoy the unreadable files in dist subfolder.

# What about security?
1) access to *.template files should be denied in the web server configuration
    (e.g. through a .htaccess file, like the one already there).
2) in case of security issue (e.g. this script has been displayed on client side),
    consider revoking and updating the GITLAB_TOKEN below.

# Common errors:
- if script fails with error
    "AttributeError: module 'magic' has no attribute 'from_buffer'"
    you may want to install python-magic module ;) (otherwise, import magic will
    load Magic-file-extensions).
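
A small sketch of how the loaded module could be checked before use. The
helper name looks_like_python_magic is hypothetical; it only tests for the
from_buffer function that this script calls.

```python
def looks_like_python_magic(module):
    # python-magic provides from_buffer(); per the error message above, the
    # other package that installs a module named magic does not.
    return callable(getattr(module, 'from_buffer', None))
```

Typical use would be `import magic` followed by
`looks_like_python_magic(magic)`, reporting a clear message when it is False.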
"""

import datetime
import os
import re
import sys
import io
import base64
import itertools
import pprint
import mimetypes
import magic
import hashlib
import pickle
import uuid

###############################################################################
# General configuration

from gitlab_config import * # For Gitlab, see gitlab_config.py

TEMPLATE_PATH = './'
TEMPLATE_JOB_OFFER_FORM = 'job_offer_form.html.template'
TEMPLATE_JOB_OFFER = 'job_offer.md.template'

JOBOFFER_TYPE = {
    'cdi': 'CDI',
    'cdd': 'CDD',
    'postdoc': 'Post-doctorat',
    'these': 'Thèse',
    'stage': 'Stage',
    'concours': 'Concours',
}

PELICAN_JOB_OFFER_PATH = 'content/job_offers'
ATTACHMENT_MIME_TYPE = ['application/pdf']
JOBOFFER_EXPIRATION_DELAY = datetime.timedelta(weeks=3*4)

# The folder where files associated to each submission are created.
# Should be a folder that depends on the current website to avoid collisions.
FLOOD_PATH = "./flood"

# Global flood limits (for all submissions)
FLOOD_GLOBAL_TIMEOUT = datetime.timedelta(minutes=60)
FLOOD_GLOBAL_LIMIT = 10

# Local flood limits (for one filler)
FLOOD_LOCAL_TIMEOUT = datetime.timedelta(minutes=60)
FLOOD_LOCAL_LIMIT = 5

# Mail config in case of flood
MAIL_SMTP_SERVER = "172.16.101.1"
MAIL_SENDER = "calcul-owner@math.cnrs.fr"
MAIL_RECEIVERS = ["bureau.calcul@services.cnrs.fr"]


###############################################################################
class Debug(object):
    """ Debugging parameters. """

    def __init__(self, verbose=False, offline=False, local=False):
        self.verbose = verbose  # Explains all options
        self.offline = offline  # Do not connect to Gitlab
        self.local = local      # Print job offer files locally

###############################################################################
class FloodChecker():
    """ Anti-flood protection """

    class FloodStats():
        """ Local or global flood stats """
        def __init__(self):
            self.first = datetime.datetime.max
            self.last = datetime.datetime.min
            self.cnt = 0

        def add(self, dt):
            self.first = min(self.first, dt)
            self.last = max(self.last, dt)
            self.cnt += 1

        def __repr__(self):
            return "FloodStats(first={}, last={}, cnt={})".format(
                pprint.pformat(self.first),
                pprint.pformat(self.last),
                self.cnt
            )


    def __init__(self):
        self.dt_now = datetime.datetime.now() # To have a consistent current time

    def _read_files(self):
        """ Read submission files and clean outdated ones """

        self.global_stats = self.FloodStats()
        self.local_stats = dict()

        # Submission file name pattern: datetime, then client hexdigest, then a uuid
        pattern = re.compile("(?P<datetime>[0-9-]+)_(?P<id>[a-z0-9]+)_[a-z0-9-]+")

        for entry in os.scandir(FLOOD_PATH):
            if entry.is_file():
                file_name_match = pattern.fullmatch(entry.name)
                if file_name_match:

                    try:
                        file_mtime = datetime.datetime.strptime(file_name_match.group('datetime'), "%Y-%m-%d-%H-%M-%S-%f")

                        # Clean if outdated file
                        if file_mtime < self.dt_now - max(FLOOD_GLOBAL_TIMEOUT, FLOOD_LOCAL_TIMEOUT):
                            try:
                                os.remove(entry.path)
                            except OSError:
                                pass # File may have been deleted by other script instance
                            continue

                        # Add to global stats
                        if file_mtime >= self.dt_now - FLOOD_GLOBAL_TIMEOUT:
                            self.global_stats.add(file_mtime)

                        # Add to local stats
                        if file_mtime >= self.dt_now - FLOOD_LOCAL_TIMEOUT:
                            self.local_stats.setdefault(file_name_match.group('id'), self.FloodStats()).add(file_mtime)

                    except (OSError, ValueError):
                        pass # File may have been deleted by other script instance, or its name holds an unparsable date

    def _client_id_hexdigest(self, client_id):
        """ Return the hexadecimal digest of the given client identification data """
        return hashlib.sha1(pickle.dumps(client_id)).hexdigest()

    def _submission_delay(self, stats, timeout, limit):
        """ Return delay before a form can be submitted (0 if no delay) """
        if stats.cnt <= limit:
            return datetime.timedelta()
        else:
            return max(datetime.timedelta(), timeout - (self.dt_now - stats.first))

    def _global_submission_delay(self):
        """ Return delay before a form can be submitted (0 if no delay) by anyone """
        return self._submission_delay(
            self.global_stats,
            FLOOD_GLOBAL_TIMEOUT,
            FLOOD_GLOBAL_LIMIT
        )

    def _local_submission_delay(self, client_id):
        """ Return delay before a form can be submitted (0 if no delay) by a given client """
        return self._submission_delay(
            self.local_stats.get(self._client_id_hexdigest(client_id), self.FloodStats()),
            FLOOD_LOCAL_TIMEOUT,
            FLOOD_LOCAL_LIMIT
        )

    def _create_submission_file(self, client_id):
        """ Add a file to register a valid submission """
        # Submission file name is composed of
        # - the current datetime
        # - hexadecimal digest of the client identification data (to ease dump to string and increase confidentiality)
        # - a random suffix to avoid generating same file name in multiple instances.
        self.file_name = "{}_{}_{}".format(
            datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S-%f"),
            self._client_id_hexdigest(client_id),
            uuid.uuid4()
        )

        # Touch the file
        with open(os.path.join(FLOOD_PATH, self.file_name), 'w'):
            pass

    def _remove_submission_file(self):
        """ Remove previously created submission file """
        os.remove(os.path.join(FLOOD_PATH, self.file_name))

    def _read_and_calc_delays(self, client_id):
        """ Read submission files and return new submission delays for the given client """
        self._read_files()

        # Calculate submission delays
        global_delay = self._global_submission_delay()
        local_delay = self._local_submission_delay(client_id)
        delay = max(global_delay, local_delay)

        return global_delay, local_delay, delay

    def _create_flood_report(self, client_id, global_delay, local_delay):
        """ Return message that describes current flooding state for the given client """
        return "[FLOOD] client_id={} client_hash={} global_stats={} global_delay={} local_stats={} local_delay={}".format(
            pprint.pformat(client_id),
            pprint.pformat(self._client_id_hexdigest(client_id)),
            pprint.pformat(self.global_stats),
            pprint.pformat(global_delay),
            pprint.pformat(self.local_stats),
            pprint.pformat(local_delay),
        )

    def approve_submission(self, client_id):
        """ Check if a new submission can be accepted. Return bool and waiting delay. """

        # First, check current flooding state to avoid spamming mail
        # if it has already been reported.
        global_delay, local_delay, delay = self._read_and_calc_delays(client_id)

        # If flooding was already reported, send new report only to webserver log
        if delay.total_seconds() > 0:
            flood_report = self._create_flood_report(client_id, global_delay, local_delay)
            print(flood_report, file=sys.stderr)
            return False, delay

        # Then, create a submission file and recheck flood state.
        # It avoids flooding at the exact same time.
        self._create_submission_file(client_id)
        global_delay, local_delay, delay = self._read_and_calc_delays(client_id)

        # If new flood is detected, send report to webserver log *and* send a mail.
        if delay.total_seconds() > 0:
            flood_report = self._create_flood_report(client_id, global_delay, local_delay)
            print(flood_report, file=sys.stderr)

            # Send report by mail
            import smtplib
            with smtplib.SMTP(MAIL_SMTP_SERVER) as smtp:
                smtp.sendmail(MAIL_SENDER, MAIL_RECEIVERS, "Subject: Flooding du formulaire d'offre d'emploi\n\n" + flood_report)

            return False, delay

        # No delay => OK
        return True, delay

###############################################################################
class FormData(object):
    """ Job offer data extracted from the form. """

    def __init__(self,
                 client_id=None, # Client identification data
                 title='',
                 description='',
                 author='',
                 employer='',
                 email='',
                 job_type='',
                 location='',
                 duration='',
                 website='',
                 expiration='',
                 attachment_content=None,
                 attachment_name=''):

        self.client_id = client_id
        self.title = title.strip()
        self.description = description.strip()
        self.author = author.strip()
        self.employer = employer.strip()
        self.email = email.strip()
        self.job_type = job_type
        self.location = location.strip()
        self.duration = duration.strip()
        self.website = website.strip()
        self.expiration = expiration if expiration else (datetime.datetime.now() + JOBOFFER_EXPIRATION_DELAY).strftime('%Y-%m-%d')
        self.attachment_content = attachment_content
        self.attachment_name = attachment_name

    def has_attachment(self):
        """ True if there is a given attachment. """
        return self.attachment_content is not None

    def check(self):
        """ Checks form validity and returns errors.

        If multi-language is needed, we could replace the
        error messages by codes.
        """

        errors = FormError()

        # Title must be set
        if not self.title:
            errors.title = 'Titre manquant'

        # Author must be set
        if not self.author:
            errors.author = 'Nom manquant'

        # Employer must be set
        if not self.employer:
            errors.employer = 'Employeur manquant'

        # Description must be set
        if not self.description:
            errors.description = 'Description manquante'

        # Checking email
        if self.email:
            if not re.match(r'^[^@]+@[^@]+\.[^@]+$', self.email):
                errors.email = 'Adresse mail invalide'
        else:
            errors.email = 'Adresse mail manquante'

        # Checking job type
        if self.job_type not in JOBOFFER_TYPE:
            errors.job_type = "Type d'offre d'emploi invalide"

        # Checking attachment type
        if self.has_attachment():
            # Getting mime type from magic numbers.
            attachment_mime = magic.from_buffer(self.attachment_content, mime=True)

            # Checking allowed mime types.
            if attachment_mime not in ATTACHMENT_MIME_TYPE:
                errors.attachment = 'Type de fichier invalide'

        # Checking expiration date
        try:
            expiration_date = datetime.datetime.strptime(self.expiration, '%Y-%m-%d')
            if expiration_date <= datetime.datetime.now():
                errors.expiration = "L'offre doit expirer dans le futur"
        except (TypeError, ValueError):
            # strptime rejects a malformed date string or a non-string value
            errors.expiration = "Format de date invalide"

        return errors

    def __str__(self):
        """ String representation with attachment length instead of its content. """
        return pprint.pformat(dict(itertools.chain(
            {(k, v) for k, v in vars(self).items() if k != 'attachment_content'},
            {('attachment_content_length', len(self.attachment_content) if self.has_attachment() else None)}
        )))


###############################################################################
class FormError(object):
    """ Error messages in the form. """

    def __init__(self,
                 title='',
                 description='',
                 author='',
                 employer='',
                 email='',
                 job_type='',
                 expiration='',
                 attachment=''):

        self.title = title
        self.description = description
        self.author = author
        self.employer = employer
        self.email = email
        self.job_type = job_type
        self.expiration = expiration
        self.attachment = attachment

    def has_error(self):
        """ True if there is any error set. """
        return (
            self.title
            or self.description
            or self.author
            or self.employer
            or self.email
            or self.job_type
            or self.expiration
            or self.attachment
        )

    @property
    def general(self):
        """ Error displayed at the top of the form. """
        if self.has_error():
            return 'Un ou plusieurs champs sont mal renseignés.'
        else:
            return ''

    def __str__(self):
        """ String representation. """
        return pprint.pformat(dict(itertools.chain(
            {('general', self.general)},
            vars(self).items()
        )))


###############################################################################
class GitlabFile(object):
    """ File committed to Gitlab repo. """

    def __init__(self, file_path, encoding, content):
        self.file_path = file_path
        self.encoding = encoding
        self.content = content

    def __str__(self):
        """ String representation. """
        return pprint.pformat(vars(self))


###############################################################################
class JobOffer(object):
    """ Job offer formatted for a Pelican website. """

    def __init__(self, form_data):

        # Copying form fields
        self.title = form_data.title
        self.description = form_data.description
        self.author = form_data.author
        self.employer = form_data.employer
        self.email = form_data.email
        self.job_type = form_data.job_type
        self.attachment_user_name = form_data.attachment_name
        self.location = form_data.location
        self.duration = form_data.duration
        self.website = form_data.website
        self.expiration = datetime.datetime.strptime(form_data.expiration, '%Y-%m-%d')

        # Attachment special care
        if form_data.has_attachment():
            # Ensuring valid extension
            attachment_mime = magic.from_buffer(form_data.attachment_content, mime=True)
            self.attachment_ext = mimetypes.guess_extension(attachment_mime)

            # Encoding attachment in base64
            self.attachment_content = base64.b64encode(
                form_data.attachment_content).decode()

        else:
            self.attachment_content = None

        # Submission date
        self.date = datetime.datetime.now()

        # Generate unique id
        self.id = self._calc_id()

        # Rendering job offer description
        self.main_content = self._render()

    def has_attachment(self):
        """ True if the job-offer has an attachment. """
        return self.attachment_content is not None

    @property
    def name(self):
        """ Base name of the job offer. """
        return 'job_{self.id}'.format(self=self)

    @property
    def file_name(self):
        """ Name of the job-offer description file. """
        return '{self.name}.md'.format(self=self)

    @property
    def attachment_name(self):
        """ Name of the job-offer attachment. """
        if not self.has_attachment():
            return ''

        return '{self.name}_attachment{self.attachment_ext}'.format(self=self)

    @property
    def blog_file(self):
        """ The main file of the blog entry. """
        return GitlabFile(
            file_path=PELICAN_JOB_OFFER_PATH + '/' + self.file_name,
            encoding='text',
            content=self.main_content
        )

    @property
    def blog_attachment(self):
        """ Return the job offer attachment. """
        if not self.has_attachment():
            return None

        return GitlabFile(
            file_path=PELICAN_JOB_OFFER_PATH + '/' + self.attachment_name,
            encoding='base64',
            content=self.attachment_content
        )

    @property
    def gitlab_files(self):
        """ The files to be committed in Gitlab repo. """
        yield self.blog_file

        if self.has_attachment():
            yield self.blog_attachment

    def _calc_id(self):
        """ Calculate job offer id. """
        # MD5 hash (hashlib is already imported at module level)
        m = hashlib.md5()

        # Feeding the hash algo with the job_offer fields
        m.update(b'|'.join(str(getattr(self, field)).encode() for field in
            ('title', 'job_type', 'author', 'employer', 'date', 'description', 'email', 'attachment_user_name', 'location', 'duration', 'website', 'expiration')
        ))

        # Feeding the hash algo with the job_offer attachment
        if self.has_attachment():
            m.update(b'|' + self.attachment_content.encode())

        return m.hexdigest()

    def _render(self):
        """ Render a job offer to a Markdown file using Jinja2. """

        from jinja2 import Environment, FileSystemLoader

        # Jinja2 environment
        env = Environment(
            loader=FileSystemLoader(TEMPLATE_PATH),  # Path to the templates
            autoescape=False                         # Manual HTML escaping
        )

        # Adding custom filters
        env.filters.update({
            'markdown': filter_markdown,
            'pelican': filter_pelican,
            'linebreaks': filter_linebreaks,
            'datetime': filter_datetime
        })

        # Gets job offer template
        template = env.get_template(TEMPLATE_JOB_OFFER)

        # Rendering
        return template.render(
            title=self.title,
            date=self.date,
            slug=self.name,
            attachment_name=self.attachment_name,
            description=self.description,
            job_type=JOBOFFER_TYPE[self.job_type],
            tag=self.job_type,
            author=self.author,
            employer=self.employer,
            email=self.email,
            location=self.location,
            duration=self.duration,
            website=self.website,
            expiration=self.expiration,
        )


###############################################################################
# Jinja2 custom filters

def filter_markdown(text):
    """ Escapes all special characters of Markdown syntax. """
    return re.sub('([' + re.escape(r'\`*_{}[]()#>+-.!|') + '])', r'\\\1', text)


def filter_pelican(text):
    """
    Escapes special characters of Pelican while keeping most of
    the Markdown syntax.

    {} to avoid {filename}, {attach} and such special tags
    |  since it is the old syntax for {}
    >  (blockquote) since it will be escaped for html
    !  for inline image links.

    TODO: also escape named links ?
    TODO: modify SimpleMDE configuration accordingly
    """
    return re.sub('([' + re.escape(r'{}>|!') + '])', r'\\\1', text)


def filter_linebreaks(text, replacement='<br>'):
    """ Replaces newlines and carriage returns by the given string. """
    return re.sub('[\r\n]+', replacement, text)


def filter_datetime(date, date_format='%Y-%m-%d %H:%M'):
    """ Format a date and time. """
    return date.strftime(date_format)


###############################################################################
def create_job_request(job_offer, debug):
    """ Creates a merge request for the given job offer.

    TODO: generates job id in this function.
    """

658
    # Connecting to Gitlab
gouarin's avatar
gouarin committed
659
660
    gitlab_private_token = GITLAB_TOKEN
    if gitlab_private_token is None:
661
662
        print("[ERROR] GITLAB_PRIVATE_TOKEN environment variable not set!", file=sys.stderr)
        raise
Matthieu Boileau's avatar
Matthieu Boileau committed
663

664
    if debug.verbose:
665
        print('[DEBUG] Gitlab connection to {} '.format(GITLAB_URL) +
Matthieu Boileau's avatar
Matthieu Boileau committed
666
              'with token {}'.format(gitlab_private_token), file=sys.stderr)
667
668

    if not debug.offline:
669
        import gitlab
Matthieu Boileau's avatar
Matthieu Boileau committed
670

Calcul Bot's avatar
PEP8    
Calcul Bot committed
671
672
        gl = gitlab.Gitlab(
            GITLAB_URL,
Matthieu Boileau's avatar
Matthieu Boileau committed
673
            private_token=gitlab_private_token,
Calcul Bot's avatar
PEP8    
Calcul Bot committed
674
675
            api_version=4
        )
676
677

    # Checking connexion ?
Calcul Bot's avatar
Calcul Bot committed
678

679
680
    # Branch name associated to this job offer
    branch_name = job_offer.name
Calcul Bot's avatar
Calcul Bot committed
681
682

    # Accessing project
683
    if debug.verbose:
684
        print('[DEBUG] Accessing bot projet {}'.format(GITLAB_SOURCE_ID), file=sys.stderr)
685
686

    if not debug.offline:
Calcul Bot's avatar
PEP8    
Calcul Bot committed
687
        project = gl.projects.get(GITLAB_SOURCE_ID)
Calcul Bot's avatar
Calcul Bot committed
688

Calcul Bot's avatar
Calcul Bot committed
689
    # Creating new branch
690
    if debug.verbose:
691
        print('[DEBUG] Creating branch {} from master'.format(branch_name), file=sys.stderr)
692
693

    if not debug.offline:
Calcul Bot's avatar
PEP8    
Calcul Bot committed
694
695
        branch = project.branches.create({
            "branch": branch_name,
696
            "ref": GITLAB_REF_BRANCH
Calcul Bot's avatar
PEP8    
Calcul Bot committed
697
        })
698

Calcul Bot's avatar
Calcul Bot committed
699
700
    # Creating commit
    data = {
Calcul Bot's avatar
PEP8    
Calcul Bot committed
701
702
        "branch": branch_name,
        "commit_message": "Adding new job offer {}".format(branch_name),
703
        "actions": []
Calcul Bot's avatar
Calcul Bot committed
704
705
    }

706
707
    # Adding files to the commit
    for gitlab_file in job_offer.gitlab_files:
Calcul Bot's avatar
Calcul Bot committed
708
709
        data['actions'].append({
            'action': 'create',
710
711
712
            'file_path': gitlab_file.file_path,
            'encoding': gitlab_file.encoding,
            'content': gitlab_file.content
Calcul Bot's avatar
Calcul Bot committed
713
714
715
        })

    # Committing
716
    if debug.verbose:
Roland Denis's avatar
Roland Denis committed
717
        print('[DEBUG] Committing:', file=sys.stderr)
718
719
720
721
        print(pprint.pformat(data), file=sys.stderr)

    if not debug.offline:
        commit = project.commits.create(data)
Calcul Bot's avatar
Calcul Bot committed
722

Calcul Bot's avatar
Calcul Bot committed
723
    # Creating merge request
724
    pr_data = {
Calcul Bot's avatar
Calcul Bot committed
725
        'source_branch': branch_name,
Calcul Bot's avatar
PEP8    
Calcul Bot committed
726
727
        'target_branch': GITLAB_TARGET_BRANCH,
        'target_project_id': GITLAB_TARGET_ID,
Calcul Bot's avatar
Calcul Bot committed
728
        'title': 'New job offer {}'.format(branch_name),
729
        'description': job_offer.main_content.replace("\n", "  \n"),
730
731
        'remove_source_branch': True,
        'labels': GITLAB_LABELS
    }

    if debug.verbose:
        print('[DEBUG] Creating PR:', file=sys.stderr)
        print(pprint.pformat(pr_data), file=sys.stderr)

    if not debug.offline:
        project.mergerequests.create(pr_data)
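

###############################################################################
# Illustrative sketch (not part of the submission workflow): the commit payload
# built above is a plain dict whose 'actions' list holds one 'create' entry per
# file. The helper name and the (file_path, encoding, content) tuples are
# hypothetical; only the dict keys are taken from the code above.
def _example_commit_payload(branch_name, files):
    """ Build a commit payload with one 'create' action per file. """
    return {
        "branch": branch_name,
        "commit_message": "Adding new job offer {}".format(branch_name),
        "actions": [
            {
                'action': 'create',
                'file_path': file_path,
                'encoding': encoding,
                'content': content
            }
            for file_path, encoding, content in files
        ]
    }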


###############################################################################
def local_write_job(job_offer):
    """ Write job offer locally. """

    for gitlab_file in job_offer.gitlab_files:
        if gitlab_file.encoding == 'text':
            with open(os.path.basename(gitlab_file.file_path), 'w') as f:
                f.write(gitlab_file.content)
        else:
            with open(os.path.basename(gitlab_file.file_path), 'wb') as f:
                f.write(base64.b64decode(gitlab_file.content.encode()))
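

###############################################################################
# Illustrative sketch: how a binary attachment round-trips through the base64
# text form that local_write_job decodes. Both helper names are hypothetical,
# and the 'base64' encoding label is an assumption (the code above only tests
# for 'text').
def _example_encode_attachment(raw_bytes):
    """ Return (encoding, content) with content as base64 ASCII text. """
    import base64

    return 'base64', base64.b64encode(raw_bytes).decode('ascii')


def _example_decode_attachment(content):
    """ Invert _example_encode_attachment, as local_write_job does. """
    import base64

    return base64.b64decode(content.encode())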


###############################################################################
def update_html_form(form_data=FormData(), errors=FormError(), debug=Debug(), internal_error=False, success=False, flood_error=False, flood_delay=None):
    """ Update the form with error messages. """

    from jinja2 import Environment, FileSystemLoader

    # Jinja2 environment
    env = Environment(
        loader=FileSystemLoader(TEMPLATE_PATH),  # Path to the templates
        autoescape=True                          # Auto HTML escaping
    )

    # Getting the job offer form template
    template = env.get_template(TEMPLATE_JOB_OFFER_FORM)

    # Allowed attachment types
    file_accept = (
        ','.join([ext
            for mime_type in ATTACHMENT_MIME_TYPE
            for ext in mimetypes.guess_all_extensions(mime_type)])
        + ','
        + ','.join(ATTACHMENT_MIME_TYPE)
    )

    # Rendering
    print(template.render(
        form=form_data,
        errors=errors,
        file_accept=file_accept,
        job_type_list=JOBOFFER_TYPE,
        internal_error=internal_error,
        success=success,
        flood_error=flood_error,
        flood_delay=flood_delay,
    ))
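

###############################################################################
# Illustrative sketch: how the `file_accept` value above is assembled for the
# `accept` attribute of the file input. The helper name and the sample MIME
# type in the usage note are hypothetical. Unlike the expression above, this
# version skips the separator when no extension is known.
def _example_accept_attribute(mime_types):
    """ Return known extensions then MIME types, comma-separated. """
    import mimetypes

    extensions = [
        ext
        for mime_type in mime_types
        for ext in mimetypes.guess_all_extensions(mime_type)
    ]
    return ','.join(extensions + list(mime_types))

# Usage: _example_accept_attribute(['application/pdf']) lists '.pdf' among the
# extensions, followed by 'application/pdf'.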



###############################################################################
def process_form(form_data, debug=Debug()):
    """ Check and submit job-offer. """

    errors = FormError()

    try:
        # Displaying form data in debug mode
        if debug.verbose:
            print('[DEBUG] Form data: {}\n'.format(form_data), file=sys.stderr)

        # Checking form validity
        errors = form_data.check()

        if debug.verbose:
            print('[DEBUG] Form errors: {}\n'.format(errors), file=sys.stderr)

        if errors.has_error():
            update_html_form(form_data, errors, debug)
            return

        # Checking for an ongoing flood of submissions
        is_flood_ok, flood_delay = FloodChecker().approve_submission(form_data.client_id)
        if not is_flood_ok:
            update_html_form(form_data, errors, debug, flood_error=True, flood_delay=flood_delay)
            return

        # Creating job offer
        job_offer = JobOffer(form_data)

        # Creating pull request
        if debug.local:
            local_write_job(job_offer)
        else:
            create_job_request(job_offer, debug)

        # Success page
        update_html_form(debug=debug, success=True)

    except Exception as error:
        # Internal error
        import traceback
        print("[ERROR] Error while submitting job offer: {}".format(error), file=sys.stderr)
        print("[ERROR] {}".format(traceback.format_exc()), file=sys.stderr)
        update_html_form(form_data, errors, debug, internal_error=True)



###############################################################################
def process_cmdline():
    """ Process command-line arguments. """

    import argparse

    parser = argparse.ArgumentParser()

    # Debug options
    debug_parser = parser.add_argument_group('debug arguments')
    debug_parser.add_argument(
        '-v', '--verbose',
        action='store_true',
        help='Output process information'
    )
    debug_parser.add_argument(
        '-o', '--offline',
        action='store_true',
        help='Disable GitLab connection and pull-request creation'
    )
    debug_parser.add_argument(
        '-l', '--local',
        action='store_true',
        help='Write job offer files locally (implies offline)'
    )

    # Form options
    form_parser = parser.add_argument_group('form arguments')
    form_parser.add_argument(
        '--type',
        default='',
        help='Offer type within ' + ','.join(JOBOFFER_TYPE.keys())
    )
    form_parser.add_argument(
        '--title',
        default='',
        help='Offer title'
    )
    form_parser.add_argument(
        '--author',
        default='',
        help='Offer author name'
    )
    form_parser.add_argument(
        '--employer',
        default='',
        help='Employer'
    )
    form_parser.add_argument(
        '--email',
        default='',
        help='Offer author email'
    )
    form_parser.add_argument(
        '--location',
        default='',
        help='Job working location'
    )
    form_parser.add_argument(
        '--duration',
        default='',
        help='Job duration'
    )
    form_parser.add_argument(
        '--website',
        default='',
        help='Web page'
    )
    form_parser.add_argument(
        '--expiration',
        default='',
        help='Offer expiration date'
    )
    form_parser.add_argument(
        '--description',
        default='',
        help='Offer description'
    )
    form_parser.add_argument(
        '--attachment',
        default='',
        help='File attachment'
    )
    form_parser.add_argument(
        '--client_id',
        default=None,
        help='Client id'
    )

    # Parsing arguments
    args = parser.parse_args()

    # Set offline to true if local is true
    args.offline = args.offline or args.local

    # Debug options
    debug = Debug(verbose=args.verbose, offline=args.offline, local=args.local)

    # Reading attachment
    if args.attachment:
        with open(args.attachment, 'rb') as f:
            attachment_content = f.read()
        attachment_name = os.path.basename(args.attachment)
    else:
        attachment_content = None
        attachment_name = ''

    # Creating form data
    form_data = FormData(
        job_type=args.type,
        title=args.title,
        author=args.author,
        employer=args.employer,
        email=args.email,
        location=args.location,
        duration=args.duration,
        website=args.website,
        expiration=args.expiration,
        description=args.description,
        attachment_name=attachment_name,
        attachment_content=attachment_content,
        client_id=args.client_id,
    )

    # Continue submission process
    process_form(form_data, debug)


###############################################################################
def process_cgi():
    """ Process cgi arguments. """

    import cgi

    # Accessing CGI form data
    cgi_form = cgi.FieldStorage()

    # Default form data
    form_data = FormData()

    # Filling form fields
    if 'title' in cgi_form:
        form_data.title = cgi_form.getlist('title')[-1]

    if 'author' in cgi_form:
        form_data.author = cgi_form.getlist('author')[-1]

    if 'employer' in cgi_form:
        form_data.employer = cgi_form.getlist('employer')[-1]

    if 'email' in cgi_form:
        form_data.email = cgi_form.getlist('email')[-1]

    if 'description' in cgi_form:
        form_data.description = cgi_form.getlist('description')[-1]

    if 'job_type' in cgi_form:
        form_data.job_type = cgi_form.getlist('job_type')[-1]

    if 'location' in cgi_form:
        form_data.location = cgi_form.getlist('location')[-1]

    if 'duration' in cgi_form:
        form_data.duration = cgi_form.getlist('duration')[-1]

    if 'website' in cgi_form:
        form_data.website = cgi_form.getlist('website')[-1]

    if 'expiration' in cgi_form:
        form_data.expiration = cgi_form.getlist('expiration')[-1]

    # Checking attachment
    # TODO: checking 'done' attribute for transfer error
    if 'file' in cgi_form:
        file_item = cgi_form["file"]
        if file_item.filename:
            form_data.attachment_name = file_item.filename
            form_data.attachment_content = file_item.file.read(-1)

    # Client identification data (IP)
    form_data.client_id = os.getenv("REMOTE_ADDR")

    # Checking if the form was submitted
    if 'submit' in cgi_form:
        # Continue submission process
        process_form(form_data)
    else:
        # Displaying form without errors, possibly pre-filled through
        #   "GET" parameters (eg http://.../add_job_offer?job_type=cdd)
        update_html_form(form_data)
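

###############################################################################
# Illustrative sketch: the "last value wins" rule used by process_cgi
# (getlist(name)[-1]), written with urllib.parse.parse_qs. This may be useful
# because the standard `cgi` module was deprecated in Python 3.11 and removed
# in Python 3.13. The helper name is hypothetical, and this sketch only handles
# URL-encoded query strings, not multipart file uploads.
def _example_last_values(query_string, field_names):
    """ Return {name: last submitted value} for the fields present. """
    from urllib.parse import parse_qs

    values = parse_qs(query_string)
    return {
        name: values[name][-1]
        for name in field_names
        if name in values
    }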


###############################################################################
# TODO: Tornado interface


###############################################################################
if __name__ == '__main__':

    # See https://stackoverflow.com/questions/11842547/distinguish-from-command-line-and-cgi-in-python
    if 'GATEWAY_INTERFACE' in os.environ:

        # Set UTF8 encoding for standard output (to avoid CGI encoding errors)
        # See https://stackoverflow.com/a/14860540
        sys.stdout = io.TextIOWrapper(
            sys.stdout.detach(),
            encoding='utf8',
            errors='strict',
            line_buffering=sys.stdout.line_buffering
        )

        # HTTP header
        print('Content-type: text/html; charset=utf-8')
        print('')

        process_cgi()

    else:
        process_cmdline()