- Sanchke's Blog

Debianizing The Library

In response to this post. Let this post be my ‘build log’. For those of you who are not familiar with how I write, essentially every paragraph is a step of thought. It’s better to read it as a script.

Using a Debian repository for my PHP software Library

I’m making the library have a Debian repository backend. If any money was at risk for this move, I’d invest in this strategy. It seems more and more big software companies (which are Linux compatible) are simply hosting their own repository… Spotify to name one off the top of my head. Discord should do it too.

… About halfway through this I realized how important this project is. I’m starting to think a properly working library will save my company and help drive the rise of many other software firms after me… There’s certainly a market here.


Let’s begin.

First off, I’d like to point to my disclaimer: I won’t release too much information regarding the Library, in my company’s interest.

Let’s begin by making everything nginx. Nothing against Apache, this website is just too small to worry about XML-style config files (screw XML btw). I’d have to update the git repository to my new standards.

I’ll copy the configure script from another website into this one to save some time. Same goes for nginx templates and makefile templates. Saves me about 30 minutes. Let me reveal that my configure scripts are written in Perl. I will admit I learned Perl ironically; unfortunately I fell in love with it, as it allows me to see how the world programmed 4 centuries ago. Nonetheless, sudo make install && nginx -s reload and we’ve got 200s.

Okay, configure worked on the first try. Now I need to adapt the .php layout to my new format. I love me a good file layout. But at this moment I need to ask myself if prepending PHP scripts is good practice; it seems like it’s “hiding” code in scripts. I suppose I’ll move the prepending scripts into the ‘include’ folder. BETTER YET, I’ll clean out the prepending scripts, take their respective contents and place them into different modules. I’m looking at this code and find no reason to keep a lot of it.

Proper PHP Error module

I can’t count how many times I’ve redone a fucking error/log module. This will be the last time I’ll do it in PHP. The best way I’ve found to report client errors is simply by appending each error to an array stored on the session. Then print out the list somewhere on the page (after the error-prone logic of course). I stash the print-out in the footer and use JavaScript to move the error somewhere more obvious. I could also place it at the bottom of the header. When the print-out is done, you must clear the error list otherwise the user will see the same set of errors over and over again on every page.

This is useful when you’ve got several redirections and/or separate POST pages that send Location headers, because the client gets to see the full error report regardless of where they end up.
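The session-backed error list described above can be sketched roughly like this. The function names (report_error, flush_errors, render_errors) and the 'errors' session key are made up for illustration, and the sketch assumes session_start() has already been called by the including page.

```php
<?php
// Hypothetical helpers; assumes session_start() ran earlier in the request.

// Append one client-facing error to the list stored on the session.
function report_error(string $message): void {
    $_SESSION['errors'][] = $message;
}

// Return all pending errors and clear the list, so the user
// does not see the same errors again on the next page.
function flush_errors(): array {
    $errors = $_SESSION['errors'] ?? [];
    unset($_SESSION['errors']);
    return $errors;
}

// Build the print-out (for the footer, say). Returns '' when there is nothing to show.
function render_errors(): string {
    $items = '';
    foreach (flush_errors() as $error) {
        $items .= '<li>' . htmlspecialchars($error) . "</li>\n";
    }
    return $items === '' ? '' : "<ul class=\"errors\">\n" . $items . "</ul>\n";
}
```

Because the list lives on the session, a POST page can call report_error() and then send its Location header; whichever page the client lands on prints the list once.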

The Library has a fuck ton of error reports, so that’s a fight I won’t pick just yet.

Proper templating (headers/footers) module

I’m going to go on a tangent here regarding templates…

I’m about to release a truth bomb on everyone. You should NEVER have a template module. I’ve tried them, I’ve made them, I’ve tested others, and my experience has led me to the Orthodox solution of templates in PHP.

How NOT to do it:

$template = new Template($title, $description, $meta);

The right way…

include 'header.php';
    Easy as that.
include 'footer.php';

If you think I’m wrong, contact me so I can tell you why you’re an idiot and you should find another profession.

After refactoring the templating in the Library I’ll begin the next phase.

Databases in PHP

I’m smart so I’ve always got my models at hand. Shouldn’t be too hard to reconstruct the database. However, I will lack test data. Another thing I’d like to work out is integrating databases in PHP code. I’ve been through many different techniques. I’m not sure which is the best, but I have 3 solutions:

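function dodbstuff($db, $args) {...}         // 1: handle passed in by the caller
function dodbstuff($args) {global $db; ...}  // 2: handle taken from a global
function dodbstuff($args) {$db=getDB(); ...} // 3: handle fetched through an accessor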

They all have their give and take. With no. 1, client programmers are aware of which database is being used. No. 2 and no. 3 are the same with respect to scope. However, no. 3 provides a level of abstraction between getting the PDO and using it, and whether that abstraction is necessary determines which of the two to pick.

We’ll be using no. 3 because this is a larger project. I’d be fine to use no. 2 though.
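A minimal sketch of what no. 3 might look like. The body of getDB() is an assumption (the Library's real one is not shown here), and the in-memory SQLite DSN is only there so the sketch runs on its own; a real setup would read its DSN from configuration.

```php
<?php
// Hypothetical accessor: creates the PDO on first use and hands back
// the same connection on every later call within the request.
function getDB(): PDO {
    static $db = null;
    if ($db === null) {
        $db = new PDO('sqlite::memory:'); // assumed DSN, for illustration only
        $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $db->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_ASSOC);
    }
    return $db;
}

// Style no. 3: the function fetches the handle itself.
function dodbstuff(array $args): array {
    $db = getDB();
    $stmt = $db->prepare('SELECT :name AS name');
    $stmt->execute([':name' => $args['name']]);
    return $stmt->fetch();
}
```

The abstraction pays off when the connection details change: only getDB() needs touching, not every function that talks to the database.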

The Library’s standards, debian repo

Really weird, the first time I made this I tried mimicking the debian repository. Little did I know the right idea was to use the repo instead of reinventing it. The naming standards are all weird. And the folder standards are all weird too. I look back to my original goals.

After doing some research into how Debian repos work, I’m once again turned on by their simplicity. The only ‘database’ required is a file system hierarchy… let me put it into graphic form for all of those following in my footsteps.

Format of debian package archive/repository

Looking at this, if I wanted to make a repository by hand it would be a nightmare. For instance, I’d have to re-compress Packages.gz and Packages.xz every time I wanted to add a package. This isn’t even accounting for the indices format, which is even more political. Implementing this is going to be more of a drag than I thought it would be. No wonder this is not a popular solution.

But I’ve got an idea, if I can get the file system layout made and have tools for adding, removing, and configuring packages I think I can pull this off. As far as adding the functionality of “private packages” (packages that one must have special credentials for), I can implement a system at the .deb download request. Sure you’ll be able to see what all files are in the private package, but I’ll worry about that when I need to. Regardless, let me get looking for good tools to use to implement this.

And there’s one! See this handy tutorial. The tutorial is very nifty, though it does not mention the process of creating the InRelease file. I’ll put it right here for those who may need it (including me down the line).

gpg -u KEYID --clearsign -sao dists/unstable/InRelease dists/unstable/Release

Remember that you need to generate a key. And if you don’t know how to use GPG or what it is, then man you’ve got some learning to do.

I’ll be using apt-ftparchive. I’ve just tested it on my local machine and it acts how I want it to.

Private Component

The first problem I face is having an ‘authentication’ module that fires only for some packages. This will be easy to implement over HTTP, however the FTP-Debian protocol will require more thinking. My first idea is to just have 2 repositories, one for proprietary and another for free. But typing that out loud just reminded me that I don’t need two repositories, I need two components. I know this because Debian splits its own archive into free (main) and non-free components.

Now that I know I need 2 components, I need to make sure that no one but the authenticated can access what’s inside non-free (a “private component”). An nginx/apache proxy domain will do well: I can simply force all requests under dists/ellem/non-free to run auth.php or something like that. What concerns me is how Linux Mint has a Contents-amd64.gz in their dists folder. If I were to also have that, it could hint at what’s inside private components. Thankfully this practice is outdated, based on this statement in DebianRepositoryFormat:

The files dists/$DIST/$COMP/Contents-$SARCH.gz (and dists/$DIST/$COMP/Contents-udeb-$SARCH.gz for udebs) are so called Contents indices. The variable $SARCH means either a binary architecture or the pseudo-architecture “source” that represents source packages. They are optional indices describing which files can be found in which packages. Prior to Debian wheezy, the files were located below “dists/$DIST/Contents-$SARCH.gz”.

This pretty much gives me the green light for private components.

TODO: How do I authenticate people when they use apt?
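One possible direction for the TODO above, as a sketch only: apt can send HTTP Basic credentials for a repository (see the apt_auth.conf man page), so auth.php could check those before a private .deb is served. Everything below is hypothetical: the function names, the $users map of login to password_hash() output, and the realm string. It also assumes the web server passes the Authorization header through so that PHP fills in PHP_AUTH_USER and PHP_AUTH_PW.

```php
<?php
// Hypothetical check for auth.php.
// $server is normally $_SERVER; $users maps login => password_hash() output.
function is_authorized(array $server, array $users): bool {
    $user = $server['PHP_AUTH_USER'] ?? null;
    $pass = $server['PHP_AUTH_PW'] ?? null;
    if ($user === null || $pass === null || !isset($users[$user])) {
        return false;
    }
    return password_verify($pass, $users[$user]);
}

// Stop the request with a 401 challenge unless valid credentials were sent.
function require_auth(array $server, array $users): void {
    if (!is_authorized($server, $users)) {
        header('WWW-Authenticate: Basic realm="non-free"');
        http_response_code(401);
        exit;
    }
}
```

In auth.php this would come down to calling require_auth($_SERVER, $users) before streaming the requested file.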

Integration with current file system, later developments.

Now let’s not forget about the other half of systems, RedHat. For right now, I’ll just be doing the Debian repository, but I will leave room for future implementation.

To conclude, we can draw the diagram of how files will be retrieved.

The Ellem Library design overview

As you can see, I’m trying to prevent redundancy by not keeping copies of Debian packages both inside and outside the repository. One thing I did not account for, which I don’t think will be much of a problem, is documentation packages. I could store documentation packages in the repository and have the website simply read the .deb and display them as HTML. Not hard.

Front-end design

The current design of the Library is pretty sexy if I do say so myself. The LMCSS project actually originated for the Library. I think I’ll merge it with the branch. The only thing I’d like to address is integrating the debian/control file into the product page. My logic is: if the page detects that the product has a .deb file, access its debian/control and display it. I can print out the raw file because Debian prefers everything to be in plain text. I will present this information in a module-style box, so I can do the same preview for .msi and .rpm down the line. Not to mention the possibility of multiple .debs.

Getting to the code.

Looking at this code, I’d forgotten how elegantly it was written; all the logic sits at that perfect balance between efficiency and organization. My lack of comments will cripple my progress though. Nonetheless I can see the function filefs_pathc, and its comment says ‘returns the download path to the file.’ But there’s no way it can be so easy that I started this whole project just to modify one function. I think I’ll start by adding a ‘location’ attribute, an enum to describe whether the file is in the file system or in the Debian repository. I’ll have to rethink the architecture. Let’s see… there are files and there are products. Products can have many files. So I’ll have to find out where the files identify their own location. And there’s no way to tell. filefs_pathc only returns the directory where the file is. What it needs to do is return the exact location of the file. That gives us our task. I’m thankful that there’s only one script that uses filefs_pathc, the download script, so the swap-out will be easy peasy!

At this point I’m dealing with this ‘filename’ convention I’ve been enforcing on myself. I’ve been trying to match the Debian naming standards, but I have not done a good job. I’ll have to refactor a few things here and there to patch things up. Specifically the ‘build’ revision. I’m gonna go ahead and make an entry in the file so I can reference it throughout the code, because right now it seems too up in the air.

As I write this documentation I realize that the majority of my abbreviations are because I can’t spell them: arch, dep, ect, ect.

Should I objectify?

This is a question I ask on nearly every project and I’ve yet to find a reasonable answer. I can’t even flag it as an “it depends” answer yet.

The question at hand is: should I make objects that represent rows out of the database, or should I just work with the rows in array form? The question is one of code readability versus amount of code, respectively. This is interesting because the less code there is, the more readable it is.

// working with rows, fetch(PDO::FETCH_ASSOC)
$row = array(
    'id'       => 5,
    'name'     => 'kevin',
    'genderid' => 19,
);

// working with objects
class person {
    public $id;
    public $name;
    public $genderid;

    // 'pivot function': copy each column of the row onto the object
    function fromrow($row) : person {
        $this->id       = $row['id'];
        $this->name     = $row['name'];
        $this->genderid = $row['genderid'];
        return $this;
    }
}

As you can see, the class/object form requires a ‘pivot function’ to convert the row into an object. This function’s line count grows with the number of columns. I’m starting to believe working with objects is less preferable for that same reason: if we were to add another column, we’d have to add a new attribute and a new line to the fromrow function. I’ll stick with working with rows.

The product tarball

I’m thinking about ditching the database altogether; the Library’s traffic is so minute I think it’s better if I just read from the file system. Or at least get a minimal database that only stores the title and description of products for searching. As far as the files go, I think I’ll pack them, along with the cover files, into a tarball. Furthermore, to prevent data duplication, I’ll have a file in the tarball called links. Links will be an ASCII list of files and where to download them. This way I can have debs located solely in the repository. The design below shows the structure of the product tarball. You may use this design freely.

The Ellem Product Tarball

Unfortunately, I have not found a system I like for reading tar files in PHP. I’ve only found tar.gz operations. I’ve seen Phar, but I could not find a way to make it read the contents of a file in the tar directly into a variable. This would be a good project to implement. The sections below were originally typed up for the readme of the project, but I’ll paste them here.
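For what it’s worth, PHP’s bundled PharData class does seem to cover this case: array access on the archive returns a PharFileInfo, and its getContent() method returns the entry as a string. The sketch below builds a throwaway tarball first so it can run on its own. The entry names and the helper name read_tar_entry are assumptions for illustration, not the actual product tarball layout.

```php
<?php
// Read one entry of a tar archive straight into a string.
// Assumes the phar extension is available (it is bundled with PHP).
function read_tar_entry(string $tarPath, string $entry): string {
    $tar = new PharData($tarPath);
    return $tar[$entry]->getContent();
}

// Build a small demo tarball in the temp directory (assumed file names).
$path = sys_get_temp_dir() . '/ellem-demo-' . uniqid() . '.tar';
$demo = new PharData($path);
$demo->addFromString('control', "Product: apple-counter9\n");
$demo->addFromString('links', "apple-counter9-bin.deb\n");
unset($demo); // release the handle before reading the file back

$control = read_tar_entry($path, 'control');
```

After this runs, $control holds the text of the control entry without anything having been extracted to disk.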

The control file’s arguments are as follows (case sensitive):

| Argument | Required? | Description |
|---|---|---|
| Product | yes | The product name (see below) |
| Product-Title | yes | The presentable title of the product |
| Packages | no | A space-delimited list of included packages |
| Description | yes | Description of the product |

Product name vs Product Title

The product name is different from the product title. The name is mostly for computers and the title is for humans. The name must be alphanumeric and lowercase, and can include dashes (“-”) and, rarely, dots (“.”). The title may include anything, as it’s the presentable name. Examples:

Product: apple-counter9
Product-Title: The great apple counter (version 9)
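The naming rule above boils down to a one-line check. This is a sketch under the assumption that the rule is exactly “lowercase letters, digits, dashes and dots, at least one character”; the function name is made up.

```php
<?php
// Hypothetical validator for the Product field:
// lowercase alphanumerics, dashes and dots only, at least one character.
function is_valid_product_name(string $name): bool {
    return preg_match('/\A[a-z0-9.-]+\z/', $name) === 1;
}
```

With this rule, apple-counter9 passes, while the title “The great apple counter (version 9)” does not, which is the whole point of keeping the two fields separate.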

Control syntax

The control file is delimited by Unix new-lines (“\n”). Each line begins with either a field name or a space (“ ”). A line that begins with a space continues the field before it, which is useful for long descriptions. An example is seen below:

Product: apple-counter9
Product-Title: The great apple counter (version 9)
Packages: apple-counter9-common libac apple-counter9-bin
Description: Let's say this is a long description and
 you can use a new-line and indent the next line with a
 space. Note you can do this for any field.

It should also be noted that blank lines are ignored.
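The syntax rules above are enough to sketch a parser. This is not the Library’s actual code: the function name is hypothetical, and joining continuation lines with a new-line (rather than a space) is an assumption borrowed from how Debian control files are usually read.

```php
<?php
// Hypothetical parser for the product control file.
// Returns an array of field name => value.
function parse_control(string $text): array {
    $fields = [];
    $current = null; // name of the field a continuation line belongs to
    foreach (explode("\n", $text) as $line) {
        if (trim($line) === '') {
            continue; // blank lines are ignored
        }
        if ($line[0] === ' ') {
            if ($current === null) {
                throw new InvalidArgumentException('continuation line before any field');
            }
            // a leading space continues the previous field
            $fields[$current] .= "\n" . substr($line, 1);
            continue;
        }
        $colon = strpos($line, ':');
        if ($colon === false) {
            throw new InvalidArgumentException("not a field line: $line");
        }
        $current = substr($line, 0, $colon);
        $fields[$current] = ltrim(substr($line, $colon + 1));
    }
    return $fields;
}
```

Since Packages is space-delimited, explode(' ', $fields['Packages']) gives the package list once the file is parsed.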

file naming standard (codename “lm.n”)

The PDL file naming standard derives from the Debian naming standard (<package-name>_<version>_<architecture>.deb, where <version> is [<epoch>:]<upstream-version>[-<debian.version>]), with a few additional rules:

| File extension | Description | Requires debian.version? | Requires architecture? |
|---|---|---|---|
| .deb | debian package | yes | yes |
| .msi | windows package | yes | yes |
| .rpm | redhat package | yes | yes |
| .orig.tar.gz | source code | no | no |
| .doc.* | documentation | no | no |
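A rough sketch of how file names under this standard could be taken apart. It assumes underscore separators as in Debian’s own file names (name_version_arch.ext for packages, name_version.orig.tar.gz for source), and it guesses name_version.doc.ext for documentation. The function name and the returned keys are invented for illustration.

```php
<?php
// Hypothetical parser for Library file names.
// Returns name/version/architecture/type, or null if the name does not fit.
function parse_library_filename(string $filename): ?array {
    $name = '([a-z0-9][a-z0-9.-]*)';
    $patterns = [
        // packages: version and architecture are both required
        "/\\A{$name}_([^_]+)_([^_.]+)\\.(deb|msi|rpm)\\z/" => null,
        // source code: no debian.version or architecture required
        "/\\A{$name}_([^_]+)\\.orig\\.tar\\.gz\\z/" => 'orig.tar.gz',
        // documentation: any extension after .doc. (assumed layout)
        "/\\A{$name}_([^_]+)\\.doc\\.[A-Za-z0-9]+\\z/" => 'doc',
    ];
    foreach ($patterns as $pattern => $type) {
        if (preg_match($pattern, $filename, $m) === 1) {
            return [
                'name'         => $m[1],
                'version'      => $m[2],
                // only package patterns capture an architecture and extension
                'architecture' => $type === null ? $m[3] : null,
                'type'         => $type ?? $m[4],
            ];
        }
    }
    return null;
}
```

The version part is kept as one string here; splitting it into epoch, upstream version and debian.version would be a second step.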


I’m going to do something different, and that is to implement ‘tools’ with my project. Specifically, I want a user-executable script that will update the Library’s database based on the files in the file system. This way I can plug it into a cron job, or pretty much do it by hand. I’ll make the tools in PHP for convenience’s sake. They’ll just never be used via the browser. This was a very quick build. Made the tool on the first draft. My next objective would be to see all the data in action via the HTML.

Product index page

The product index page is going to be the bulk of the workload. I’ve just finished the website index and it wasn’t that hard at all. For the product index, however, I’m going to have to modify every call to the old database. And I’ve just made a discovery while refactoring, and that is “abstraction by specification”. Instead of making “link generation” functions outside of the HTML page, I’d make them inside of it. This makes link generation specific to this page only, so if I were to search for the functionality on a later day, I’d know exactly where it is rather than shuffling through abstraction layers.

Thanks to…

  1. The usual communities whose technology is accelerating the world: git, debian, gnu, linux, apache, nginx, ubuntu, linux mint
  2. devtodo. Though I’m still using version 1, I think your tool is sexy, clean, fast, and elegant. Thank god v2 uses json. XML anything is 5 decades out of date.

What’s next?

I want to integrate a git server into the Library as well. Something that can handle git requests too.

Ideas that stem from this project