10,000 TPS per second

I ran across Kiplinger’s article and picture of “The Credit or Debit Debate Visualized” It is a very nice picture of both the usage of Credit and Debit Cards over time, as well as a nice list of pros and cons and differences between each, I encourage you to check it out for a good basic summary.

In payment systems I don’t really care as much about the type of card, Credit vs Debit at a basic level to me — one has PIN’s and requires usage of HSM’s and require real-time reversals and one uses clearing files and the other reconciliation files. But I digress.

At the bottom of the picture there are some statistics, Kiplinger being a personal finance website and not to mention various “Letters“, these are mostly consumer related, but his one caught my eye:

ABA-TPS.png

Which makes me chuckle because lots of prospects tell us they need a system to be able to support the worlds average TPS (Transaction Per Second), or a small fraction of.

You don’t know until you know (or go into Production)

images-2.jpgOver the last six months we have been busy building and implementing an OLS.Switch Issuer Implementation with one of our customers and their banking and payment processing partners. It has been a process of reviewing and implementing message specifications, business processing requirements, authorization rules, clearing, settlement, flat file and reporting requirements. We also filtering external messages into our IMF – Internal Message Format based on ISO8583 v2003, build an interface Card Management functions via our local API’s and message sets. Building client simulators and trying to faithfully reproduce what happens when you are connected to a real system.

Testing on test systems is the next step – replacing our client simulators with other “test” systems that are driven by simulators by the processing gateway we interfaced to. Those simulators have limitations – in their configured test suites or test scripts, some require manual entry to send original data elements for subsequent transaction types, (e.g completions and reversals). We generate clearing and settlement files and match those to on-line test transactions, and our use cases.

After on-line testing, you connect to an “Association” test environment to do “Certification” and run a week’s worth of transactions through a wider test bed. Then you are certified, your BIN goes live and then you enter a production pilot mode – where you watch everything like a hawk.

You can do all of the simulated testing for both on-line transactions and off-line clearing and settlement files that you want – when you connect to the real world and do your first pilot transaction that is where most likely you will see something that wasn’t simulated, tested, or even included in certification, it happens. You need to be proactive, set-up reviews and manual interventions, perform file generation when you have staff available to review the output before it is released for further processing.

What have we seen :

  • Test environments that are not as robust as production or not setup with up-to-date releases.
  • Certain real-world examples are hard to simulate – reversals, time-outs.
  • Thinly-trafficked transactions: (chargeback, representment)…people can’t even define these much less create them in test
  • Poor or incorrect documentation of message specifications.
  • You receive Stand-In Advices or other transactions on-line that you don’t see in testing or certification.

Production pilot is a very important phase of testing – It is where you discover and address the < 1% of issues nobody catches in prior project life-cycles. What can happen, WILL happen. What you think might be something that will occur infrequently will bite you sooner, not later.

Encryption options from the POS/Payment Device/Terminal

matrix-computer-screen

There are a few different ways of implementing "encryption" from the POS/Payment Device/Terminal, I though I’d look at a few in a short post:

1) Tunneling – using existing applications and their connections over an encrypted tunnel   e.g.over a VPN, SSH, stunnel, etc. This approach doesn’t require any changes to devices or message formats or the payment "server"

2) Transport level – using TLS/SSL over TCP/IP Sockets – or at a higher level (web server/web service) using HTTPS. – devices needs to support the ability and make this type of connection, message formats are not modified.

3) Data Element or Field level — if you only want to encrypt the PAN or other specific fields, and these fields are defined to support the increased length required of the encrypted payload  — this requires changes to the message formats, devices and payment "server" software. Consider truncating the Account Number/Track Data in DE 2 or DE 35 in ISO8583  for use of displaying purposes on the terminals screen or receipt and consider using another Private ISO field for the payload.

The approach will depending on what the "devices" sending the transaction can support, both from a connection perspective as well as a software perspective. I’d also recommend consider using asymmetric encryption rather than symmetric here, as then the devices would not have the ability to "decrypt" as they would not have the private/secret key, and would help with eliminating private key storage at the device level if you choose option 3. There are implementations that use HSM’s and the DUKPUT algorithm as well.

We have an implementation of #3 that I wrote about here. — relevant paragraph below:

Some of our implementations of OLS.Switch supports field or data element level encryption that is passed on from the Point of Sale system to our switch. The main thing that allow us to perform this is that:  We or our customer "own/control" the POS message format to us and can adapt and handle the programming of the POS System and POS message formats – our account number fields are not limited to 16 digits – we can handle a much large encrypted value. So over the wire – these implementations are "protected" from eavesdropping or sniffing.

I plan to write more on E2EE (End to End Encryption) in the coming weeks as well, so stay tuned !

Operations Considerations Batch Files and Extract and Import jobs

ibm_data_processing_0

I’ve been working some batch file based jobs for a project here for OLS. There are two sides of this; sending "clearing" files of transactions in a defined format to third party which I will call the extract, and receiving "refresh" or table updates from this third party, I’ll refer to this as the import. The extract file contains financial transaction records, and the import file contains entity information such as merchant information. The Extract File Layout is quite standard an looks something like:

File Header

—-Merchant Header

——–Batch Header

————Detail and Addendum Record(s)

————Detail and Addendum Record(s)

——–Batch Trailer

—-Merchant Trailer

File Trailer

 

The File Layout for Import is a list of fields per record, nothing real fancy here.

I don’t want to spend too much time about the actual mechanics of the Extract and Import jobs themselves, but rather the Operational Considerations of this and others that we have performed:

Validity — You need to decide how to handle invalid records in a file or valid records without proper supporting data (transactions for a merchant that wasn’t setup in your system) ? You can write off the bad record to an exception file, and address it later, or you can reject the full file, the approach depends by implementation and requirements. We also mark files with a .bad extension if we detect they are invalid to help prevent subsequent processing steps, like transmitting a half-baked file. We also perform duplicate file checking as well as other validation steps.

Completeness — You need to make sure that you read in the entire file or extract a complete file. Monitoring controls such as checking the number of lines and file size in an Extract file, as well as checking the last time of a file for a specific record such as a File Trailer. Reconciliation between hash totals and amounts is also a good practice. On the import side you can count the number of lines or records in a totals from a trailer record and compare that to what was imported.

Timeliness — Some extracts take minutes and others hours, scheduling and monitoring the process is essential to perform data processing on a timely basis to other parties. Monitoring "check-points" in the process as well as a % of records completed help here to detect problems proactively with monitoring. Collect job performance metrics, it is valuable to keep track of and chart the total run time of each job and compare it to its history, to detect slowdown’s or to correlate increase or decrease in process times with external events or transaction growth.

Delivery — Consideration for the delivery of the file must be made. File Transfer procedures that address file naming conventions, steps to upload a complete file (upload as a .filepart extension and the rename to the full name upon complete transfer) as well as secure delivery as well as archiving locally or remotely, compression, and any file level encryption. It is also a good practice to reconnect to the file server and perform a directory listing on the files that you uploaded to confirm that they were transferred successfully.

Security — While account numbers and such are encrypted in databases (column level, file level, internal database level) the file specifications don’t allow for encrypted card-numbers, so both file level asymmetric encryption using the public key of the recipient as well as transport level encryption to send the file (see Delivery above) need to be considered. Archival files stored on disk will also need to be stored encrypted as well.

Troubleshooting/Restart Procedures — You need to develop procedures to support the following:

· re-sending failed files

· re-running the extract or import process for a specific date

· preventing duplication or invalid files or data.

The End is Just the Beginning — Operations is just the start of a process that has no end, it requires daily care and maintenance. These processes and controls need to work in harmony on a continuous basis and be able to be enhanced based upon the results of monitoring and other operational tasks.

When End-to-End Encryption is really not End-to-End.

I’m reading a lot about solutions that implement end-to-end encryption, where account numbers and track data is encrypted and can utilize a Hardware Security Module (HSM) and DUKPT or other encryption algorithms from the point-of-sale. I thought it important to share what data is actually encrypted in the payment system.

 

Here is a list in no particular order:

Gateways:

(contact me and I’ll add you if you are not listed)

 

Most of these are ISO’s that sell you a merchant account and access to their gateway that uses "end-to-end" encryption and that it will shift the PCI and PA-DSS burden from you to them, if you are a merchant, some claim you don’t even need to go through PCI compliance because you don’t have access to the card numbers or the encryption keys to decrypt the cards (Please also see this post on this subject).  This is all really good stuff, I’ve written about End-to-End Encryption before and am a big proponent of it. This can help prevent "sniffers" and card capturing malware from capturing track data and account numbers in the clear as they traverse your internal network. Attackers would either need to install card skimmers or gain access to encryption keys, or use brute force methods against captured encrypted data to capture data at your store.

But it isn’t really End-to-End Encryption.

Let look at two examples:

  1. A typical small merchant using a payment gateway
  2. A large retailer or processor/gateway that uses a payment switch

 

A typical small merchant that uses a payment gateway:

Slide2

 

A large retailer or processor/gateway that uses a payment switch

Slide1

( uses leased lines to connect directly to a Payment Processor (FDR, Chase/PaymentTech, Fifth Third, MPS, etc ) or Interchange Network (VisaNet, BankNet, etc )

Let’s say that you are using a gateway or even a switch that supports an encrypted message format from the point-of-sale (POS). The area in RED in each diagram shows where the account number traverses the payment networks in clear text. At the small merchant example from the Gateway to the rest of the network – the account number and track data and CVV2/CVC2 data is sent in the clear. In the direct connect model with the Payment Switch (which actually just operates as a local gateway) from the payment switch to the rest of the network. So End-to-End is really not End-to-End at all. (it depends on where you define end :)  This should also explain why End-to-End Encryption in its current state would not of prevented the breach at Heartland Payment Systems – as a processor they need to connect and communicate over the interchange networks using TCP/IP connection and ISO-8583 messages to these endpoints.

 

Why is this ?  The Payment interchange networks and message formats that processors and the Interchange networks use does not support this in their current message formats (primarily ISO-8583) There is no room in the current implementations of Visa’s Base1, MasterCard’s MIP, or FDR’s message formats for example. Data Elements can be added to support this, but would require massive changes to Payment Systems infrastructures and systems.

 

Does any one have any solutions for this ? Please provide comments below — I’ll provide a follow-up blog post with some of my ideas.

 

Remember that End-to-End is really not End-to-End, it may shift or transfer some of the compliance "burden"  from the merchant to that of the processor, but still exists in clear text on private networks and at processors.  Oh, and tokenization and secure card vaults would work the same way here, the cards need to be translated to their raw value to ride the payment networks.

Card Readers in Vending Machines

Years ago I assisted a company that developed magstripe readers that would operate in vending machines, copiers, laundry machines for a project related to college campus cards.  My part was to assist them with both the message formats, connection methods, as well as selecting transaction types and device captures modes (Host Based Capture works the best in this model, BTW) for integration to a payment switch and authorization host and ultimately certifying the different devices.

 

While I was in Dallas last week I took a snap shot of a vending machine that had a similar device:

12132008059

These are not new, but I don’t visit Vending Machines like I used to and don’t see that many Vending Machines that accept payment cards. This appears to be a model from USA Tech called the ePort. I got a water and coke for a total of $3.00, btw 🙂

OLS – Company Profile in The Green Sheet

logo3In Issue 081201 of The Green Sheet There is a Company Profile of OLS (On-Line Strategies, Inc)

Hugh Bursi, Director of Marketing of OLS worked with the Green Sheet to put this company profile together. Although the article doesn’t reference me or my 12 years of Payment Experience  (Which is small compared to Hugh and Andy’s ) at a Third Party Processor as Director of Technology and Development where I worked with both Insuring and Acquiring Bank’s and ISO’s, it is a great article and great to be in The Green Sheet!

Sending Alerts to the SysLog

12-5-2008 10-23-24 AMOn my jPOS page I added a link to a screencast that I did that shows the basic configuration and usage of the jPOS SysLogListener.

If you are not familiar with syslog, it is the logging daemon for Unix and Linux. There are implementation’s for MS Windows as well, such as Kiwi Syslogd. (and some that peel entries from the Windows/NT Event Log and forwards then to a centralized syslog server) Many alerting systems are based off of syslog events where you can define an action to call an external program/script, send an email/page/SMS notification. Or you can even use splunk as a syslog daemon and "google" your logs.

 

Enjoy the screencast.

 

 

 

 

New Blog : Prepaid Enterprise

money card Andy and I recently persuaded our new colleague (and resident expert on Issuing, among many other thing) : Randy San Nicolas to start a blog.  Well he did and he wrote his first post.  Check it out at http://prepaidenterprise.typepad.com/ and soon to be:  http://www.prepaidenterprise.com/

We also did a short "mini-cast" on the Payment Systems Podcast on the topic of Issuing Card Program Management, I’ll post that as soon as I can get it produced.

Welcome Randy!

 

Product Release Cycle

updateI’ll just say it; I’m proud of our release cycle for OLS.Switch.

 

It has been my experience (YMMV) both first hand running an authorization host/switch (issuing and acquiring) and as an IT Security Auditor and QSA – that either Core Banking applications or Payment Switches fall into one of the following when it relates to upgrades, changes, or security updates:

  • "The Vendor set it up, we don’t touch it"
  • "We don’t patch it because we are afraid"
  • "We cringe everything we need to install a new release of the software"
  • "Last time we did an upgrade, we had x amount of downtime"
  • "It all goes smooth like clockwork"  🙂

 

During Vulnerability Assessments and Penetration Testing on the Internal Networks that I performed– My observations from an operating system, database and application perspective – these systems are typically not keep current or run on a platform that the organization is not very familiar which and relies on outside support. The application was not cohesive to the rest of their operating environment: systems, technologies, and procedures.

 

Installing new releases of our software (or rather our clients installing releases of new software) is something that does not make me cringe. (and I used to not sleep very well in the past)  At least one of our clients seems to agree as well. (See Andy’s "A very simple platform to support")

 

We just rolled out a new release that was quite large (see Flexible Spending Accounts (New Initiatives, Part 3) and had changes that impacted pretty much every transaction path due to partial authorization and credit reversal support, and required heavy regression testing. Our agile based SDLC is a big help with this, we have very iterative development processes and frequent testing which also means less large bulking updates that break everything.

 

Another success factor is our simplicity of upgrading our program code and binaries. It is really as simple as:

  • Stop the OLS.Switch Service or Daemon
  • create a backup copy of the directory or file path where OLS.Switch is installed
  • Start the OLS.Switch Service or Daemon
  • Perform test transactions and monitor.
  • The Back-Out plan is to stop the service and revert back to the backup copy.

 

Further, System Implementation Design can have a big impact on Up-time;  we run multiple independent application servers behind load balancers, that allows us to gracefully stop an application – which stops accepting new transactions when finishing to process those in its queue, and the load balancer stops routing transactions to this application server. Allowing an upgrade to be made while other application servers are still processing transactions. Uptime, not system uptime, but uptime processing transactions doesn’t have to suffer for "scheduled maintenance", or security related patches and reboots.

 

I think we have a low-risk upgrade/update path that our clients are very comfortable with – So in 7 months into the year – we have had a dozen releases to add additional functionality, address endpoint changes, and implement new transaction types.